EDITOR'S INTRODUCTION


Nearly a decade ago the author of the present volume prepared
and published, in this series of textbooks, what proved to be an
important work on Tests and Measurements for Teachers. Its large
sale and its continued popularity alike indicate that he then
advanced the testing movement by a new type of
test-and-measurement textbook. Taking the point of view that it
was better educational procedure to use the activities and
problems of the educational program, rather than a description
of different tests with their uses and norms, as a basis for an
understanding of how measurement techniques contribute to the
improvement of teaching and the solution of the instructional
problems of the school, he produced a volume that has had an
unusual success as a classroom text. In it he showed teachers
how to select and use the particular type of measurement tool
which could be most effectively applied to a given educational
situation, and pointed out the practical significance of results
obtained from the use of this particular method. He thus tended
to give teachers a new conception of the meaning of the testing
movement as involving the evaluation of instructional procedures
and teaching efficiency, and the use of tests and testing
procedures in the solution of classroom educational problems.

In the years that have intervened since the publication of this
earlier book there has been a marked shift in the direction the
author then indicated. The tendency now is to relate courses on
tests and measurements still more closely to the activities and
problems of the school itself, in order to find solutions for
the teaching, supervisory, and administrative questions that
constantly arise. The emphasis today is placed more and more on
determining the adequacy of the objectives, materials, and
learning activities of the educational program, rather than on
the comparison of testing results with test norms or group
averages. Still more, the emphasis has shifted to the use of
test results in the analysis of teaching difficulties, and their
employment in determining how children who experience trouble in
learning may be aided.

In constructing the present textbook the author has emphasized
the use of measurement in the improvement of the teaching and
learning process, and in so doing has stressed four points of
importance in present-day schoolroom practice.

1. The tendency of the progressive movement to shift the
attention of teachers to somewhat new objectives and to many new
procedures, without adequate attention to the attainment of
either old or new objectives, has created a difficult
educational problem. This tendency to throw away old guides and
principles, some of which had attained a high degree of validity
through use and modification, makes an intelligent use of
diagnostic and appraisal testing all the more imperative if the
confusion is to be terminated and costly mistakes repaired or
avoided.

2. There is an erroneous impression among some that measurement
is related primarily to subject-matter teaching. It should be
pointed out that measurement is not related primarily to the
assumed or alleged nature of either learning or the learner, nor
to the types of objectives stressed or their number, nor to any
particular philosophy, materials, or activity. It merely
requires clarity at the point of objectives, purposes, or aims,
and then aids teachers to determine, much more accurately than
is possible through usual unaided estimates and judgment,
whether these purposes and aims are being attained.

3. The primary source of confusion and difficulty in testing,
which has led many teachers to adopt an unfavorable attitude
toward the measurement movement, is the vagueness of the
teachers' understanding of the aims and purposes of testing.
Not only do diagnostic and appraisal testing depend for their
adequacy and for their contribution to learning on a detailed
knowledge of educational aims and purposes, but the selection of
teaching materials and equipment, and the direction of learning
activities themselves, are significant and effective to the
extent that teachers know specifically what they are trying to
do. The author has pointed out this difficulty and illustrated
it, and has given some principles for the analysis of vague
general objectives into implications sufficiently specific to
serve as a basis for adequate testing.

4. So far as intelligence-test data are concerned, he points
out the value of analyzing factors of intelligence, such as
memory, visual ability, verbal ability, and inductive and 
deductive reasoning, in contrast to the use of the I.Q. The 
latter gives merely a prediction of individual success, but is 
valueless in pointing out how failing children may be aided. 
On the other hand, if a student is failing, and his weakness is 
memory or visual ability, the teacher has a clue in aiding him. 
Obviously such a pupil should receive a different type of help 
from one who has a very good memory or good visual ability, 
but who does not understand the implications of facts nor 
possess the ability to utilize them in solving problems. 

In the present volume there is little test description. The
emphasis is placed on the diagnosis of teaching difficulties
and the direction of learning at the different school levels, and
on an appraisal of the educational program through measurement
procedures. It does not displace the earlier text; rather
it advances our knowledge of the use of tests in diagnosis and 
in the application of remedial teaching procedures. 

Ellwood P. Cubberley


Eight years have passed since the publication of the author’s
Tests and Measurements for Teachers. The ensuing period has been
characterized by a gradual shift from the statistical and
test-description methods of presenting courses in measurement to
approaches based more closely on the activities and problems of
the educational program. The present volume is an attempt to
meet the measurement problems and needs of teachers on different
levels, as well as to provide data for the solution of
supervisory and administrative problems.

Research in the field of intelligence-testing has clarified 
some of the past difficulties in utilizing the I.Q., and has 
pointed the way for improvement in obtaining and utilizing 
mental-maturity data. The necessity for analyzing I.Q.’s 
through more individual testing and new-type intelligence 
tests has been demonstrated. The emphasis in appraisal
testing has passed from the mere making of comparisons with
norms, or other group averages, to use as a point of departure 
in investigating the adequacy of objectives, materials, and 
learning activities. 

There are signs of a gradually dawning conviction that no 
real incompatibility exists in education between idealism in 
aims, purposes, and activities, and objective evidence of 
attainment; in fact, the contrary attitude is held by a large 
number of progressive educators. The poet may inspire 
thousands with an appreciation of the sacredness of individual 
personality without ever aiding anyone to change a single 
child; measurement reveals specific facets and objectifies the 
unique elements of personality, thus providing a functional 
basis for directing growth and enrichment. Many educational 
enthusiasms have come and gone because they were based on 
mere faith. Henceforth, no educational program, however
ideal in conception, can hope to gain serious consideration or
acceptance if it seeks to avoid responsibility or otherwise fails
to provide valid evidence of its own success.

The present volume is not a mere revision of the original text
but has been written as a new textbook. Among important
departures are the following: Relatively few tests are
described; the Buros and Hildreth bibliographies, among others,
constitute an excellent point of departure for the selection of
tests; the original tests and test manuals should in any case be
studied if the student hopes to get any adequate understanding
of their nature and potential contributions. The treatment of
standardized tests has been delayed until near the end, when a
better basis for understanding has been established.
Statistical concepts and computations have been reduced to a
minimum consistent with the needs of teachers and
administrators. Certain topics, which presented concept
difficulties, have been presented in simpler language.

In addition to those who aided with the original volume, the
writer is indebted to many of his students for assistance in
gathering evidence of teacher measurement problems and needs.
He further is indebted to Dr. Alice Horn, of the Los Angeles
City Schools, for materials on regression effects in testing; to
Dr. Alfred Lewerenz, of the Los Angeles City Schools, for
materials related to grade placement in reading; to Mr. Willis
W. Clark, Dr. Elizabeth Sullivan, and Mr. D. S. Belden, of the
Los Angeles County Schools, for materials related to
intelligence and intelligence testing; to Dr. Louis P. Thorpe,
of the University of Southern California, for assistance in the
field of personality and adjustment; and to Dr. E. P. Cubberley,
of Stanford University, for unfailing courtesy and encouragement
during the preparation of this manuscript.

Ernest W. Tiegs


Los Angeles, California


PART I

A SURVEY OF THE FIELD 




CHAPTER ONE 


THE NATURE AND FUNCTION 
OF MEASUREMENT 


When the Athenians failed to take Syracuse during the second
period of the Peloponnesian War and began their retreat toward
the interior of Sicily, they were surrounded and forced
to surrender. Thousands of Athenian soldiers were imprisoned 
in stone quarries where most of them perished after weeks of 
suffering. However, history records that those who could 
teach the Syracusans a sufficient number of verses of Euripides 
in a satisfactory manner were considered worthy to live and 
their lives were spared. 

More than two thousand years later, in seventeenth-century 
England, when a drought, an illness, or a mere rumor was 
sufficient to start Matthew Hopkins on a witch hunt, some 
of the victims of this panic fared even worse. Their "thumbs 
and toes were tied crosswise” and they were thrown into 
water; if they floated they were considered guilty. 

While the inadequacies of these two tests are now obvious to 
laymen as well as to teachers, they illustrate, in somewhat 
extreme fashion, certain erroneous examination practices of 
today which involve just as great a disparity between evidence 
and conclusions.

The purpose of this opening chapter is to sketch briefly the 
history of tests and examinations, to describe and illustrate 
certain persistent fallacies and errors which limit their
usefulness, and to explain the nature and contributions of better
tests. 

The achievement examination idea. The examination idea is very
old. The Chinese system of formal examinations probably had its
inception some thirteen centuries ago. While we know little
about the long period which followed, there is no evidence to
indicate that anyone seriously questioned the alleged merits of
the examination procedure. Passing directly to the American
scene of Colonial days, we find that the oral examination had
become the favorite method of judging the adequacy of learning
as well as teaching. Pupils were questioned individually in
turn before the entire class, and the fate of an individual
often hung upon his reaction to the more or less accidental
nature of one or two questions. Furthermore, the official
examiner was a school committee man rather than a teacher. Such
was the situation which confronted Horace Mann some hundred
years ago when he became Secretary of the State Board of
Education of Massachusetts.

Horace Mann's criticisms. The weaknesses of the oral
examination as a measurement device were described at length
and with rare insight, for the times, by Horace Mann.¹ He
pointed out the following among other limitations: Questions
differ in difficulty and merit; students examined later in a
period may often profit by the mistakes of preceding students;
good students may fail on very difficult questions (and poor
students succeed on very easy ones); examiners often ask
leading questions which suggest answers and thus defeat the
purpose of the examination; and the time available for each
student is usually insufficient adequately to examine anyone.

To correct these evils, Mann recommended the general use of
written examinations. He argued that they were impartial, just,
and more thorough than oral examinations; that each student of a
class of thirty would receive sixty minutes of examination
during an hour period, whereas each received only two minutes
under the oral examination plan; and that the opportunity for
suggestion and favoritism would be eliminated.
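
Mann's comparison of examination time can be checked with a few lines of arithmetic. The sketch below is only an illustration: the function name and its parameters are hypothetical, and it assumes, as Mann's argument does, that oral questioning divides the period evenly among pupils while a written examination occupies every pupil for the whole period.

```python
def minutes_per_student(class_size: int, period_minutes: int, written: bool) -> float:
    """Examination time each pupil receives during one period (illustrative)."""
    if class_size <= 0:
        raise ValueError("class_size must be positive")
    if written:
        # Every pupil writes for the whole period at the same time.
        return float(period_minutes)
    # Oral questioning is taken in turn, so the period is shared out.
    return period_minutes / class_size


# Mann's figures: a class of thirty and an hour period.
ORAL = minutes_per_student(30, 60, written=False)
WRITTEN = minutes_per_student(30, 60, written=True)

if __name__ == "__main__":
    print(f"oral: {ORAL} minutes each, written: {WRITTEN} minutes each")
```

The thirtyfold difference is simply the class size, which is why the advantage Mann claimed grows with larger classes.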

¹ Caldwell, O. W., and Courtis, S. A. Then and Now in Education.
New York, World Book Company, 1923. (Quoted from Mann’s report in the Common
School Journal, 1845.)

But he went further in his claims. He insisted that once
questions were committed to writing their ease or difficulty
could be appraised, and that the pupil answers to these
questions constituted indisputable proof of the quality of
teaching.

According to his conception, therefore, examinations still 
had but one function, that of appraisal. The possibility of 
using tests and examinations to guide teaching and learning 
and adjustment, had not yet emerged. 

Evidence to be presented later will disprove some of Mann’s 
claims, but nevertheless should not blind us to the fact that 
he was the outstanding American educational leader of his day. 

The modern testing movement. The modern testing movement is a
product of many diverse influences and contributions. While
space will not permit an adequate historical treatment, a few
scattered examples will serve to indicate the nature and
complexity of its origin.

By 1864, Fisher, an English schoolmaster, seems to have sensed
some of the fundamental weaknesses of the methods of evaluation
used in his day and as a result developed a series of more or
less objective scales for various subjects of the curriculum,
giving questions and sample answers.

Wundt is credited with having established the first
psychological laboratory at Leipzig in 1879. Others followed.
While this had no immediate relationship to the appraisal of
achievement, it eventually contributed in many ways to the
measurement movement.

Galton and later Thorndike and Cattell, among others, developed
many of the fundamental techniques for the statistical treatment
of data.

The dawn of the twentieth century saw the movement in full
swing. Binet and Simon were using an analytical and
experimental approach in their study of individual evidences of
intelligence. The concept of general intelligence presented
itself and was accepted. The idea of the intelligence or mental
quotient seems to have occurred almost simultaneously to at
least three different workers.

Meanwhile, other leaders were active in the field of
achievement testing. One of the earliest American contributions
was that of Rice in the field of spelling. Thorndike, in
writing, and Stone and Courtis, in arithmetic, were other
pioneers. In a relatively short time standardized achievement
tests or scales were appearing in many fields.

It was natural that some workers, enthusiastic over the
possibilities of these new instruments, should make claims that
later could not be substantiated. Many in the field who
understood neither the nature nor merits of tests rushed to give
them in the schools because the practice became popular. But
they did not know what to do with the results. Other school
officials made erroneous deductions regarding the adequacy of
teaching and learning, and teachers became suspicious of the
whole testing movement.

Universities and colleges began offering courses in statistics 
and testing. The literature of the field was limited, and many 
of the concepts were new and difficult. The major content of 
some courses consisted of a study of the characteristics of a 
limited number of standardized tests. Other courses became 
involved in the statistical validation of tests, which is primarily
a problem of test-makers. The major emphasis for nearly 
two decades seems to have been confined to the measurement 
of certain pupil outcomes at the end of the year or semester 
only. 

However, during the past few years workers have discovered 
other important uses of objective tests. Many leaders now 
believe that the most significant contribution of measurement 
lies in its functional value in the guidance of learning activities. 
With this viewpoint the writer is in full accord. 

Unfortunately, certain erroneous educational concepts and
beliefs, though often laid low by accumulating evidence from
laboratory and field, remain to confound each succeeding
generation of teachers who attempt the study of measurement.
The ghosts of these educational fallacies persist in rising and
mixing with the legitimate offspring of careful experimental
and statistical research. Before proceeding with the discussion
of the newer and more important uses of measurement, therefore,
it is desirable to repeat some of the evidence which will aid in
exposing and eliminating the most serious of these errors.

Simple estimates and judgments. When thirty-two students of
measurement were asked to estimate the length of an instructor’s
desk, their responses varied as indicated in Table 1.


Table 1. Estimates of Linear Distance

Length of Desk      Number of Students
4' 10"                      1
4'  8"                      2
4'  6"                      1
4'  2"                      5
4'  0"                      8
3'  8"                      7
3'  5"                      2
3'  2"                      5
3'  0"                      1


The actual measured length of the desk was four feet two
inches. While several students made a correct estimate and
several others made relatively small errors, the majority
failed. If, in a simple situation involving a familiar concept
which had been used for years in daily life, errors of two to
eight inches are typical, it should be clear that similar or
more serious errors are inevitable in estimating and judging
the much more complex, intangible, and elusive factors
involved in the nature of ability and learning.
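
The figures of Table 1 can be tallied directly. The following is a minimal sketch, not part of the original study: the names `TABLE_1` and `summarize` are hypothetical, and the grouping of errors into "two to eight inches" and "over eight inches" is an assumption chosen to match the wording of the paragraph above.

```python
ACTUAL_INCHES = 4 * 12 + 2  # measured length of the desk: 4 feet 2 inches

# (feet, inches, number of students), as read from Table 1
TABLE_1 = [
    (4, 10, 1), (4, 8, 2), (4, 6, 1), (4, 2, 5), (4, 0, 8),
    (3, 8, 7), (3, 5, 2), (3, 2, 5), (3, 0, 1),
]


def summarize(rows, actual):
    """Return simple error counts for a table of grouped estimates."""
    errors = []
    for feet, inches, count in rows:
        error = abs(feet * 12 + inches - actual)  # absolute error in inches
        errors.extend([error] * count)            # one entry per student
    return {
        "students": len(errors),
        "exact": sum(1 for e in errors if e == 0),
        "two_to_eight": sum(1 for e in errors if 2 <= e <= 8),
        "over_eight": sum(1 for e in errors if e > 8),
        "largest": max(errors),
        "mean_error": sum(errors) / len(errors),
    }


SUMMARY = summarize(TABLE_1, ACTUAL_INCHES)

if __name__ == "__main__":
    for name, value in SUMMARY.items():
        print(name, value)
```

On this tally only five of the thirty-two estimates are exact, and the average error is somewhat more than five inches on a fifty-inch desk.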

Estimating and marking citizenship. The possibility and
desirability of measuring the learning skills or tools basic to
all learning are now generally conceded. It is important that
students should know how to obtain rapidly and accurately facts
and principles important in themselves, and particularly those
needed in the solution of problems. Another important trend in
modern education is concerned with the development and
adjustment aspects of personality. In recognition of the
responsibility of the schools to society, outcomes of this type
are often grouped under the general concept of citizenship.
However, such traits and qualities as co-operation,
punctuality, accuracy, and disposition, which often appear on
report cards, are not simple but complex concepts. Ratings on
them often reveal wide variations.

Five elementary teachers in a departmentalized school were
requested to make special citizenship ratings of a number of
their pupils. Table 2 presents the ratings of all five teachers
of the same pupil.

Table 2. Five Citizenship Ratings of the Same Pupil

                          Teachers
Qualities          1     2     3     4     5
Co-operation       A     A     C     D     A
Punctuality        B     B     D     D     A
Work habits        B     A     C     D     A
Dependability      A     C     D     C     A
Disposition        B     B     C     B     A


All five teachers were familiar with the pupil rated in Table
2. Marks in each of the first four items vary from A to D; in
the fifth, from A to C. Teachers Number 3 and 4 differ widely
from teacher Number 5. Strangely enough, the report card
issued to this pupil had five A’s in citizenship. The data of
Table 2 support the evidence of many similar studies which
reveal how unreliable such ratings may be.
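
The disagreement in Table 2 can be expressed as the number of letter steps between the highest and lowest mark given for each quality. This is only an illustrative sketch: the numeric values assigned to the letters, and the names `GRADE_VALUE`, `spread`, and `teacher_marks`, are assumptions introduced here and are not part of the original ratings.

```python
# Assumed ordinal scale for the letter marks (A highest).
GRADE_VALUE = {"A": 4, "B": 3, "C": 2, "D": 1}

# Marks of teachers 1 to 5, in order, as read from Table 2.
TABLE_2 = {
    "Co-operation": "AACDA",
    "Punctuality": "BBDDA",
    "Work habits": "BACDA",
    "Dependability": "ACDCA",
    "Disposition": "BBCBA",
}


def spread(marks: str) -> int:
    """Letter steps between the highest and lowest mark for one quality."""
    values = [GRADE_VALUE[m] for m in marks]
    return max(values) - min(values)


def teacher_marks(table: dict, teacher: int) -> str:
    """All marks awarded by one teacher (numbered from 1)."""
    return "".join(marks[teacher - 1] for marks in table.values())


SPREADS = {quality: spread(marks) for quality, marks in TABLE_2.items()}

if __name__ == "__main__":
    print(SPREADS)
    print("Teacher 5:", teacher_marks(TABLE_2, 5))
```

A spread of three steps is the full width of an A-to-D scale, so on four of the five qualities the teachers disagree as much as the scale allows.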

Estimating and marking achievement. The familiar essay
examination is still the favorite method of evaluating learning
for a large part of the teaching profession, particularly on the
high-school and college levels. It usually consists of five or
more questions upon which the student may write for a limited
period of time. The examiner then reads the answers and
assigns a mark ranging from 0-100 per cent, or from Fail to A.
The fact that letters of the alphabet or other characters are
sometimes substituted for per cents does not alter the essential
nature of these marks; in fact, the meanings of these letters and
characters are usually defined in terms of per cents. Many
studies¹ of marking have supplied conclusive evidence of the
fact that this is a time-consuming, wasteful, and unreliable
technique of appraisal even though it may be useful for a
certain type of diagnosis. Table 3 presents the results of five
such studies.

Table 3. Variability in Marking Five Essay Examinations

                              Essay Examinations
Marks        7A*          H. S.†     H. S.†      College*      Jr. H. S.*
Assigned     Physiology   English    Geometry    Literature    Science
                 1            2          3           4             5
90-100           4           61          2           2             5
80- 89           8           67         18           9             7
70- 79           3           11         40          13            12
60- 69           2            2         35           0            13
50- 59           9            1         16           8             4
40- 49           4                       2           2             0
30- 39           0                       1                         1
20- 29           1                       1
10- 19
 0-  9
No. of
Markers         31          142        115          34            43

* Studies by the author.
† Studies by Starch, D., and Elliott, E. C.

¹ See educational indexes for the work of Starch, Daniel, and Elliott, E. C.; Ruch, G. M.;
Nettels, C. H.; Kelley, F. J.; Bolton, F. D., and others.

The most striking characteristic of the data of Table 3 is the
wide range of marks assigned on identical examination papers
by different teachers. The same high-school English
examination was marked by 142 different teachers of English and
the difference between the highest and lowest mark was about 40
points. The range on the college literature examination
marked by thirty-four teachers was 50 points; on the
junior-high-school science examination, 60 points; and on the
physiology and geometry examinations, 70 points.

In general, the larger the number of markers, the wider the
range; in Table 3, however, the range is smallest where the
number of markers was largest. It appears to make little
difference whether or not the markers are especially prepared in
the field of the examination. The examination in geometry,
for example, marked by geometry teachers, yielded a difference
of 70 points between the lowest and highest mark; the
junior-high-school science papers, some of which were marked by
instructors not teaching science, yielded a range of 60 points.

Not only will different markers vary widely, but when the
same teacher re-marks a set of twenty-five or thirty papers after
the first marks are forgotten, he does not even agree with
himself, but awards some marks twenty to thirty points different
from his first attempt. If, however, the marker knows the
students whose examination papers he marks, he is influenced
by certain impressions he has obtained of them and tends to be
more consistent. However, this does not mean that his marks
are correct; it means only that he tends to vary less from his
first marks, from which, nevertheless, other markers may differ
widely.

It is now more than twenty years¹ since this inevitable
variation in marking was demonstrated, yet teachers continue to
waste time and energy giving and scoring numberless examinations
of this type as if they were accurate measurements of student
achievement.

¹ See Starch, D., Educational Psychology, New York, The Macmillan Company, 1924, for
an account of early studies of marking.

Because achievement cannot be measured by the essay
examination, thousands of elementary pupils who are able to do
high-school work this year will be failed. Other thousands
who are promoted to high school will fail because they have
not attained the objectives of the elementary-school curriculum.
Thousands of high-school seniors will be recommended to college
who will fail because they are not prepared, but their teachers
do not know it. College students of uncertain achievement
will receive special honors in large numbers because marks are
determined on the above basis. And many incompetent teachers,
lawyers, doctors, engineers, and dentists will be graduated
on the same basis.

When promotion, honors, or graduation so often depend
upon reaching or exceeding a given passing mark, such as 60,
70, or 80, a difference of even five points may be fatal. When
teachers regularly differ from one another by thirty to fifty
points on the same evidence, it is time that this wasteful and
meaningless substitute for measurement of learning success be
discontinued.

Sources of variability in marking. This inevitable spread or
variability in marking does not arise from the incompetence of
teachers but from certain other factors inherent in this type of
marking situation. Some teachers assume that all questions are
of equal merit; others attempt to weight them, giving more
credit for some than for others. Most teachers mistakenly
believe that they know how to frame examination questions of
appropriate difficulty. Actually, they cannot do this; hence,
on some examinations all or nearly all students are awarded high
marks, and on others, low marks. The only way adequately to
measure the difficulty of examination items is to give them to a
representative group of pupils and thereby determine how
successful they are. But since typical essay examination
questions cannot be marked with any degree of accuracy, it is
impossible to determine exactly how difficult they are.

Different teachers often have widely different ideas of the
correct answer to a question. For example, when examination
Number 1, Table 3, was being marked, the values assigned the
answer to the third question varied from five to twenty points.
Teachers were then requested to write what they considered
correct answers; significantly, only a small minority were
willing to accept any one answer submitted.

Children and youths differ widely in emotional stability and
motor ability. Some experience various degrees of blocking of
the mental processes at examination time and fail to do
themselves justice; a few never really get started. The effort
involved in formulating acceptable English sentences and
paragraphs, which may be defensible as an evidence of English
ability, prevents them from revealing some of their achievement
of geographical, historical, scientific, or other outcomes.
In addition, the mere physical effort and control necessary in
writing over an extended period appears to act as a disturbing
factor in orderly thinking and produces variable results.

The traditional essay examination does not usually constitute
a well-balanced sampling of the objectives in a given field.
Instead, too much depends on a limited number of aims and
purposes. In one somewhat extreme case, an examination of five
questions, lasting two hours, dealt with considerations
touched in only seven meetings of a forty-eight meeting course;
the activities of forty-one meetings were ignored. Teachers
are usually concerned primarily with materials and procedures.
They do not know specifically enough what objectives they
should guide learners in attaining. Periodically they look back
through the subject-matter units or topics covered to determine
what their objectives should have been. They proceed then to
write largely useless essay examination questions.

The passing-mark fallacy. Since teachers regularly award
marks ranging from 40 or 50 to 90 on the same examination,
it follows that the question of passing depends much more
upon the teacher who happens to read a paper than upon the
actual achievement of objectives which it may represent.
From this it follows that the passing mark is not a stable point
of reference, as usually conceived, with which actual
achievement can be compared, but one which moves up and down in
response to other factors.

The problem will be clearer from an inspection of Figure 1.



FIG. 1. PASSING-MARK CONCEPT


No teacher knows how difficult the questions of an examination
are even when they are carefully written. Yet often only the
first question has been formulated when the class begins to
write and the remainder are framed as the examination proceeds.

If the questions are too difficult, the time too short, or the
teaching has been inadequate, pupils are likely to fall below the
passing mark of 75, as indicated in Figure 1, where each dot
represents a student. Usually, realizing that a whole class or
any large part of a normal class should not be failed, the teacher
then places the passing mark at some such point as L. On the
other hand, if the test questions prove easy and most marks
approach 100, the teacher often moves the passing mark up to
H. However, there is no assurance that a single one of these
marks approaches an actual measure of student achievement.
The passing mark thus proves to be an illusion.

Passing marks and standards of work. Since the passing
mark, as generally conceived, has no actual existence, it
follows that establishing passing marks of 70, 75, or 80 is
largely meaningless, since they can have no demonstrable
relationship to the actual quality of work done. Table 4
presents the passing marks and per cents of students receiving
various marks, as they might exist in three different schools.


Table 4. Passing Marks and Standards of Work

                      Per Cent of Students Receiving
                      Marks of Following Sizes
Marks Received        School A       School B       School C
                      Pass at 70     Pass at 75     Pass at 80
(1)                   (2)            (3)            (4)
90% or more             5%            10%            21%
85-89%                 12%            18%            56%
80-84%                 18%            53%            20%
75-79%                 33%            15%             2%
70-74%                 25%             4%             0%
Below 70%               7%             0%             1%
Totals                100%           100%           100%


Table 4 reveals that these three schools vary widely in the 
per cent of various marks awarded. For example, School A 
awards only 5 per cent of its students a mark of 90 per cent or 
more; School B awards to 10 per cent, and School C, to 21 
per cent of its students, marks of this size. School A with the 
lowest passing mark (70 per cent) failed 7 per cent; School B 
failed 4 per cent; and School C, 3 per cent. The per cent failed 
is probably more significant than the passing mark so far as a 
comparison of these schools is concerned; but this is a wholly 
false standard, since so many unreliable methods are used in 
determining marks. A high failure rate is no guarantee of 
high standards. 
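
The relation between a passing mark and the per cent failed can be checked directly from the figures of Table 4. The following is a minimal sketch in Python, not a procedure given by the author: the band boundaries, the data layout, and the function name `per_cent_failed` are illustrative assumptions, and the passing marks of 75 and 80 for Schools B and C are inferred from the failure rates quoted above.

```python
# Lower and upper bounds of each mark band, highest band first (assumed layout).
BANDS = [(90, 100), (85, 89), (80, 84), (75, 79), (70, 74), (0, 69)]

# Per cent of students in each band, in the same order as BANDS.
SCHOOLS = {
    "A": {"passing_mark": 70, "per_cents": [5, 12, 18, 33, 25, 7]},
    "B": {"passing_mark": 75, "per_cents": [10, 18, 53, 15, 4, 0]},
    "C": {"passing_mark": 80, "per_cents": [21, 56, 20, 2, 0, 1]},
}


def per_cent_failed(passing_mark, per_cents):
    """Add the per cents of every band lying wholly below the passing mark."""
    return sum(
        per_cent
        for (low, _high), per_cent in zip(BANDS, per_cents)
        if low < passing_mark
    )


for name, school in SCHOOLS.items():
    failed = per_cent_failed(school["passing_mark"], school["per_cents"])
    print(f"School {name}: pass at {school['passing_mark']}, failed {failed}%")
```

The school with the lowest passing mark shows the highest failure rate, which is the point of the paragraph above: neither figure, taken alone, fixes a standard of work.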

The same general problem is involved in recommending high 
school graduates for college entrance. The fact that students 
are selected from the upper fourth or half of the class is no 
guarantee that they will be successful. One high school in a 
western city was censured for recommending C students to college 
when other high schools were recommending only students 
earning A’s and B’s. Investigation revealed that the A’s and 
B’s of several schools constituted a larger per cent of the marks 
awarded than the combined A’s, B’s, and C’s of the censured 
school. 

As long as these traditional methods of evaluation are used, 
it is impossible to determine with any degree of accuracy the 
achievement of a single learner or of a group, hence no valid 
comparisons between classes or schools are possible. It is im¬ 
possible to be accurate or fair in promotions, honors, or gradu¬ 
ation. Discussions concerning standards are largely meaning¬ 
less because the data upon which they are based are highly 
unreliable. But most serious of all is the fact that these tradi¬ 
tional unreliable methods provide no adequate bases for de¬ 
tecting learning difficulties or unattained objectives; there is 
little hope, therefore, that an adequate program can be devel¬ 
oped or maintained whether learners are passed or failed. 

The new viewpoint in measurement. The principal function 
of measurement is the improvement of learning. This ob¬ 
jective is attained largely through a systematic diagnosis of the 
capacity to learn and the particular manner in which individu¬ 
als learn best; through a determination of immediate learning 
difficulties for the purpose of guiding instruction; and through 
periodic appraisals of the success with which learners have at¬ 
tained the objectives of the curriculum as a whole. 

Measurement is also used somewhat less directly to appraise 
objectives, materials, and activities as the basis for desirable 
modifications, and to aid in the solution of other teaching, 
supervisory, and administrative problems. 

Measurement defined. Measurement is not synonymous 
with testing although tests are often used. Carefully devised 
check lists, systematic recorded observations, individual exam¬ 
inations, and other devices for obtaining accurate information 
are included. Some outcomes lend themselves to pencil-and-paper 
tests; certain aspects of shopwork require the square and 
compass; music, art, radio programs, and motion pictures 
require in addition still other techniques of evaluation. 

Measurement is not something added to the teaching-learning 
process which merely requires more teacher time and 
energy. It is a vital constituent element of the process itself 
which saves time and energy by providing the basis for a more 
adequate direction and evaluation of learning, and relieves 
teachers from monotonous repetition of activities which are 
ineffective. Far from endangering teachers or their work, it 
is their surest method of avoiding certain types of errors, and 
of protecting themselves against unfair charges of inefficiency. 
Measurement is simply a method of outwitting the fallibility 
of the senses that provides more accurate information than 
may be obtained through unaided estimate or judgment. 

Finally, measurement is not a static completed process. On 
the contrary, measurement theory and practice are constantly 
being modified and expanded in harmony with changing concepts 
of the aims and purposes of education and acceptable evidences 
of their attainment. Certain present outcomes, real or 
imaginary, cannot yet be measured, but if proof of their existence 
is forthcoming, methods which may be used by teachers 
will be devised to measure them. 

Need for measurement. The unusual professional activity 
of the past decade may easily have created the impression that 
most educational problems are practically solved. Objectives 
have been re-examined and restated; materials have been modified 
in the hope of better fitting them to the needs, interests, 
and capacities of children and youth; and an almost unending 
variety of educational activities has been produced in the belief 
that they are more effective than older procedures. 

However, a study of the work of the schools themselves does 
not bear out this impression. It is impossible to tell how 
effective many activities are because no one has ever attempted 
to measure them. Mere variety in activities is no cause for 
alarm, but many apparently conflicting concepts of aims and 
purposes, as well as activities, often exist in the same school 
system and in the same school. The present organization of 
work and the teacher load make it difficult, if not impossible, 
for teachers in many schools to obtain adequate information 
concerning the problems and difficulties of individual students. 
Much time is consumed on activities related to objectives 
already attained, and activities already proved ineffective are 
repeated endlessly in the hope of aiding students who experience 
learning difficulties. 

School reports indicate that hundreds of thousands of appar¬ 
ently normal school children and youths still fail of promotion 
in America each year. Many of those who “failed” last year 
do well this year; many who received good marks last year 
have this year become “failures.” In fact, if at the end of any 
semester after the failure lists have been completed, a differ¬ 
ent but equally competent teacher evaluated the work of 
each class, scores of thousands who were failed by the first 
teachers would be promoted, and other scores of thousands pro¬ 
moted by the first teachers would now find themselves on fail¬ 
ure lists. 

It is generally believed that a major purpose of education is 
defeated through the effects of failure on the development of 
personality; also, an enormous economic waste results from 
paying a second or third time for the same service. Yet just 
as serious, perhaps, are the effects of promotion and apparent 
success on those who have failed to attain, adequately, the 
objectives of the curriculum. 

The solution is not, therefore, the immediate adoption of 
non-failure programs. Such a step simply avoids the issue. 
A defensible non-failure program can be achieved only by providing 
the training and working conditions for teachers which 
will make it possible for them to detect and eliminate learning 
difficulties and thus guarantee a student attainment of objectives 
which will merit virtually universal promotion. 

Diagnosis and appraisal. Diagnosis may be defined as the 
location and identification of the causes of learning or adjustment 
difficulties; appraisal, as the measurement of the success 
with which the objectives of the curriculum have been attained. 
In one sense, therefore, they have rather distinct functions. 
In another sense they are closely related, because the appraisal 
of attained objectives is often the first step in diagnosis since 
it locates the general areas in which more specific difficulties 
exist. Part II of this volume, Diagnosis and the Direction of 
Learning, includes a large use of appraisal in this sense. Part 
III, on the other hand, is concerned primarily with the appraisal 
of outcomes themselves, and the implications of the 
obtained measurements in the more general management and 
direction of educational activities. 

The basic technique in both diagnosis and appraisal in education 
is the objective test. These two functions will be 
treated together in the discussion which follows. 

Objective or new-type measurement. The objective, or 
new-type, test is so constructed that all who mark it obtain 
the same result. This result, once obtained, represents a much 
more stable and accurate measurement of ability or achievement 
than does the traditional teacher’s mark, and may thus 
be used for making reliable comparisons. Furthermore, since 
teachers agree on the marking of each item, an inspection of 
the test results reveals specific learning difficulties and unattained 
objectives which may be used to guide further learning. 
Whereas the traditional essay examination consists of a few 
more or less general questions which may be answered in many 
ways, the new-type test consists of many relatively specific 
items, each of which usually requires a definite and unequivocal 
answer. The following are samples of the most commonly 
used types of objective test items: 

1. True-false type: 

If teachers were highly efficient all students would attain 

the objectives of the curriculum. (Underline true or false.) T F 

2. Multiple-choice, or best-answer type: 

The essay examination should be discontinued as a device 
for determining student success because (select the best 
answer) 

a. It is being criticized. 

b. It yields marks which are too high. 

c. The marks obtained are not measures of the ability or 
achievement being tested. 

3. Completion type: 

The typical essay examination does not include a well-balanced 
sampling of ______ (Complete the statement.) 

The objective, or new-type, examination is based on the 
assumption that teachers should know as definitely as possible 
exactly what important facts, principles, generalizations, skills, 
ideals, attitudes, and adjustment patterns learners should 
attain or develop. Then, regardless of whether they are sub¬ 
ject matter minded, or teach by the topical, the most radical 
unit organization, or some other method, they can better select 
subject-matter and adjust procedures and activities to attain 
these objectives. Thus, the construction of the new-type test 
corresponds much more closely to the nature and needs of the 
learning situation than does the traditional method of evalua¬ 
tion. Furthermore, since it does not control the type of teach¬ 
ing or learning, but attempts to measure only the ends attained, 
it is not in conflict with the concept of the learner as an inte¬ 
grated functioning personality. 

The point score. Since the answers to each of the items of 
the new-type test must be definite and specific, no matter how 
complex the mental activities which are called into action in 
determining them, it is possible to indicate the success or fail¬ 
ure of the student on each, and to award appropriate credit. 
In short objective tests given primarily for diagnostic purposes, 
relatively little attention is given to scores, the objective 
being to obtain the list of unattained objectives for the class 
as a whole, as well as for each student. In longer tests given 
for measuring the success of the student on a unit, topic, sec¬ 
tion of a course, or a whole course, a definite credit, usually 
one point, is given for each correct answer. 

In the traditional per cent system of marking, the higher 
the quality of the answers, as judged by the marker, the nearer 
the mark approached 100. However, the size of the point 
score in objective testing does not reveal its significance. For 
example, a point score of 60 would be meaningless until compared 
with what the class as a whole averaged on such a test. 
The score would attain additional significance if it was learned 
that the student had responded correctly to every item. On 
the other hand, a point score of 95 or 100 may be very poor, if, 
for example, the test contains 300 items and the class average 
is 175. 
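
The dependence of a point score on its setting may be illustrated with a short computation. What follows is only a sketch under stated assumptions: the function name `describe_point_score`, the returned fields, and the three class scores (chosen so that they average 175) are hypothetical and do not come from the text.

```python
from statistics import mean


def describe_point_score(score, class_scores, item_count):
    """Place a raw point score beside the class average and the test length."""
    class_average = mean(class_scores)
    return {
        "score": score,
        "class_average": class_average,
        "difference_from_average": score - class_average,
        "fraction_of_items_correct": score / item_count,
    }


# Hypothetical class whose scores average 175 on a 300-item test.
summary = describe_point_score(95, [150, 175, 200], item_count=300)
print(summary)
```

A score of 95, which would have been an excellent mark under the per cent system, here falls 80 points below the class average and represents fewer than one-third of the items.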

The norm vs. the passing mark. We have already seen how 
the varying difficulty of examination questions, differences in 
length of time allowed for writing examinations, variations of 
opinion of what constitutes correct answers, and other factors, 
lead teachers to award marks differing by as much as 50 or 60 
points on the same examination. This would refute any claim 
that a mark of 75 or 80 represents a definite quality or quantity 
of work, or that such marks have any sure value for comparative 
purposes. In objective testing procedure, most of these 
difficulties are eliminated and relatively stable points of comparison 
can be established. 

To measure all attainments of the learner is a practical impossibility, 
hence testing should be regarded as a sampling of 
these attainments. The more comprehensive the sampling 
and the greater its fidelity to the objectives which should guide 
learning, the nearer will the score on the test approach a true 
measure of the trait, quality, or particular achievement being 
evaluated. When an adequate test is given to a large, unselected 
group of learners of similar age or grade, the average 
score earned by this larger group, known as a norm, may be considered 
representative of the achievement of such learners in 
general. When a class earns an average score equal to this 
norm, the class is said to be up to norm. This indicates that 
approximately one-half the class received scores higher and 
one-half lower than the norm. 

Because each item of the test has only one correct answer 
and because the writing time can be indicated, the difficulty of 
the test items determined by tryouts, and other safeguards 
used, the norms of carefully developed tests have significant 
value for comparative purposes, which increases as the items 
of a particular test constitute a well-balanced sampling of the 
objectives of a particular curriculum. 

Reliability and validity. A test is reliable when it yields the 
same results consistently. These results may not be correct; 
reliability in its technical sense means mere consistency and is 
not interpreted in its popular sense which includes the ideas of 
dependability, truthfulness, and the like. The essay examina¬ 
tion lacks reliability because different teachers obtain results 
which vary greatly. Validity, on the other hand, does mean 
truthfulness. A test is valid when it measures what it is de¬ 
signed to measure, provided it is used for that purpose. 

Objective tests yield scores which are highly consistent, 
relatively, when compared with the marks of essay examinations. 
The carefully selected items of a comprehensive standardized 
test, furthermore, yield scores which are truer measures 
of the traits and qualities than are the marks of essay examina¬ 
tions; hence the objective test is more valid. In general, valid¬ 
ity is interpreted to include reliability, since a valid test must 
of necessity be reliable. A test may, however, be reliable with¬ 
out being valid as when it is consistently wrong or untruthful. 
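
The distinction between consistency and truthfulness can be made concrete with a small numerical sketch. It is an illustration only: the function name `consistency_and_bias`, the use of the standard deviation as the index of consistency, and both sets of marks are assumptions introduced here, and a "true" score is supposed known purely for the sake of the example.

```python
from statistics import mean, pstdev


def consistency_and_bias(readings, true_value):
    """Return (spread, bias) for repeated measurements of one performance.

    spread: population standard deviation of the readings (0 = fully consistent).
    bias:   mean of the readings minus the assumed true value (0 = truthful on average).
    """
    return pstdev(readings), mean(readings) - true_value


# Hypothetical marks given by three markers to a paper whose true score is 70.
consistently_wrong = [60, 60, 60]   # every marker agrees, but all are 10 points low
inconsistent = [55, 70, 85]         # right on the average, but markers disagree widely

print(consistency_and_bias(consistently_wrong, 70))
print(consistency_and_bias(inconsistent, 70))
```

The first set is reliable in the technical sense yet not valid; the second shows the wide disagreement among markers that the text attributes to the essay examination.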

Types of norms. Age and grade norms have been the most 
commonly used types. Age norms for a given test are derived 
by giving the test to a representative group of appropriate age 
range, dividing the total group into smaller groups on the basis 
of age, and computing the average score for each succeeding 
age group. Usually, such groups are divided by month intervals. 
Thus, if the pupils who are 9 years, 6 months old chronologically 
earn an average score of 117 on a reading test, those 
who subsequently earn scores of 117 on this test are said to have 
reading ages of 9 years, 6 months. 
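
The derivation of age norms described above can be sketched in a few lines. The example is hedged throughout: the function names, the six pupil records, and the rule of assigning a score to the age whose norm lies nearest to it are assumptions made for illustration, since the text does not say how scores falling between two norms are handled.

```python
from collections import defaultdict
from statistics import mean


def age_norms(records):
    """records: (age_in_months, score) pairs from the norming group.

    Returns a dict mapping each age in months to the average score at that age.
    """
    scores_by_age = defaultdict(list)
    for age_months, score in records:
        scores_by_age[age_months].append(score)
    return {age: mean(scores) for age, scores in sorted(scores_by_age.items())}


def age_equivalent(score, norms):
    """Age in months whose norm is nearest the score (assumed lookup rule)."""
    return min(norms, key=lambda age: abs(norms[age] - score))


def format_age(months):
    years, remainder = divmod(months, 12)
    return f"{years} years, {remainder} months"


# Hypothetical norming records; 114 months is 9 years, 6 months.
records = [(113, 110), (113, 120), (114, 115), (114, 119), (115, 118), (115, 122)]
norms = age_norms(records)
print(norms)
print(format_age(age_equivalent(117, norms)))
```

Grade norms would be computed in the same way, with grade and month of the school year taking the place of age in months.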

Grade norms are derived in the same manner, except that 
pupils are divided into grade groups (usually by month intervals) 
and average scores for each succeeding group computed. 
If students in the third month of the sixth grade make an aver¬ 
age score of 128 on an adequate arithmetic test, those who 
subsequently earn scores of 128 on this test are said to have 
grade placements of 6.3. 

Either grade or age norms may be in percentile form; that is, 
they may be organized in tables to show what percentage of 
each age or grade group has reached or exceeded various scores. 
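
A single entry of such a percentile table might be computed as follows. This is a minimal sketch; the function name `per_cent_reaching` and the ten scores of the norming group are hypothetical.

```python
def per_cent_reaching(score, group_scores):
    """Percentage of the norming group that reached or exceeded the score."""
    reached = sum(1 for s in group_scores if s >= score)
    return 100 * reached / len(group_scores)


# Hypothetical scores for one grade group.
group = [100, 110, 117, 120, 128, 130, 135, 140, 150, 160]
print(per_cent_reaching(128, group))
```

Repeating the computation for each score of interest, and for each age or grade group, yields the table described above.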

Recently, other types of norms and scores have come into 
more general use. Scaled scores, standard scores, T-scores, 
and other methods of indicating achievement and making 
comparisons between the results of different tests sometimes 
accompany or supersede regular age and grade norms. Since 
their derivation is rather technical, they will receive consideration 
in Chapters Fourteen and Fifteen on “Nature of Valid 
Measurement.” 

Types of tests. It will be helpful, before going on to the 
next chapter, to obtain a somewhat complete picture of the 
types of tests most often used by teachers, supervisors, and 
administrators. Each of these types will receive more extended 
consideration in connection with the services which 
they render in guiding learning and appraising educational 
outcomes. 

I. Informal objective tests. These are primarily teaching-learning 
tools. They should be based largely on the specific objectives of 
the curriculum and serve to guide instruction and motivate 
learning. 

II. Standardized tests. 

A. Tests of learning skills 

1. The diagnostic-survey type. This type of test samples the 
essential tools of learning. It is superior to informal ob¬ 
jective tests because its items have been more carefully 
selected and developed and it is more comprehensive, hence 
it is more efficient in detecting unmastered skills which are 
causing or may cause learning difficulties. In addition, it 
yields the grade placement or age to which the pupil’s mas¬ 
tery corresponds. The results are useful both to teachers 
and administrators. 

2. The survey type. The major function of this test is to 
obtain the grade placement or age to which the skill mas¬ 
tery of a class or a pupil corresponds. It is useful prima¬ 
rily to supervisors and administrators. 

B. General achievement tests 

1. Subject tests such as in history, English, and science. The 
major purpose of these tests is to determine the level of 
achievement in these fields. They are used by both teach¬ 
ers and administrators. 

2. Achievement batteries. In these tests a number of content 
subjects are frequently combined with skill tests of the sur¬ 
vey type. A measure of the level of achievement in each, 
as well as for the combined total, is obtained. The results 
are used by both teachers and administrators, but prima¬ 
rily by the latter. 

3. Instructional tests. These are usually series of short 
achievement tests on the separate topics or sections into 
which a subject or course has been divided, and for which 
norms have been developed. When well made, they con¬ 
stitute samplings of the most important objectives to be 
attained through each topic. They perform the same gen¬ 
eral learning guidance function in this field as do the diag¬ 
nostic-survey tests of the skill type in guiding educational 
activities, in mastering the tools of learning. They are 
used only by teachers. 

C. Intelligence or mental-maturity tests. These tests may be of 
the individual or group type. To overcome language, age, 
racial, and other handicaps, they are developed both in verbal 
and non-verbal form. The major purposes of such tests are to 
discover capacity for learning and to predict probable achievement. 
The results are used by both teachers and administrators. 

D. Personality and adjustment profiles and inventories. These 
instruments are designed to identify and use certain factors 
important in learning and adjustment usually designated as 
intangibles because they defy appraisal or diagnosis by means 
of ordinary ability, skill, or achievement tests. Results are 
useful to teachers, supervisors, and administrators, but are 
used primarily by teachers. 

The services of measurement. In conclusion, therefore, 
the major purposes of measurement are (1) to aid teachers better 
to direct learning activities by locating and analyzing learning 
difficulties, and (2) at certain points along the way, to make 
valid appraisals of the success of pupil-teacher efforts. The 
fact that many measurement devices are not perfect and that 
others are relatively undeveloped does not justify a continuance 
of traditional attitudes and wasteful practices. Diagnostic 
and appraisal testing supplant relatively uncertain and often 
erroneous impressions with accurate and useful information. 

Measurement is not dependent upon the assumed or alleged 
nature of the learner or learning, but is concerned with the 
objectives of educational activity and the evidences which are 
acceptable as proof of the attainment of these objectives. 
It is not dependent upon the number or types of objectives 
involved in educational activity, but merely requires clarity 
at the point of objectives to determine exactly what teachers 
are attempting to do, and renders its services in these terms. 
Measurement is not dependent upon or closely related to any 
type of materials or procedures. Instead, it reveals informa¬ 
tion regarding their effectiveness in the attainment of whatever 
objectives are set up. 




QUESTIONS AND EXERCISES 

1. Give several examples of disparity between evidence and conclusions 
in teaching and supervision. 

2. In what way, if any, do you disagree with Mann’s position on the 
written examination? 

3. Describe briefly the rise of the modern testing movement. 

4. What is your explanation of the fact that linear distances cannot be 
estimated more accurately? That character traits cannot be esti¬ 
mated more reliably? 

5. To what extent do you disagree, if at all, with the author’s inter¬ 
pretation of the data of Table 3? 

6. Make a list of the sources of variability in marking. Add as many 
as you can to those presented. 

7. If possible, obtain a set of essay examinations, number them, and 
on a separate paper note the mark which you would award. Have 
several friends do the same. After two or three weeks, repeat the 
process without reference to the first marks awarded. Report the 
results to the class. 

8. How would you explain the fallacious nature of the passing-mark 
concept? 

9. How are passing marks related to standards of work? In what 
respects would you differ from the point of view of the author? 

10. Define measurement and indicate its principal functions. 

11. Make a list of the problems and difficulties of teachers and admin¬ 
istrators which measurement may aid in solving. 

12. Define diagnosis and appraisal; explain their relationship. 

13. Explain: Objective test; test item; point score; reliability; validity. 

14. Contrast the norm and the passing-mark concepts. 

15. Explain what is meant by age and grade norms. 

16. Describe the types of tests most often used. 

PART II 

DIAGNOSIS AND THE DIRECTION 
OF LEARNING 




CHAPTER TWO 


DIAGNOSIS OF ABILITY AND 
APTITUDE 


Most teachers are now familiar with the general concept of 
intelligence or mental maturity. They believe that children 
and youth differ in capacity to learn and they desire to adjust 
educational activities to these differences. They are aware, 
in a general way, that differences in intelligence are related to 
school success as well as to the development of rich, well- 
balanced, effective personalities; they want to co-operate in 
using such measures as are available in the improvement of 
learning. 

However, their information is usually restricted to a knowledge 
of the I.Q.’s (intelligence quotients) of their students. 
They have heard authorities disagree on its nature and its 
significance. Their own experiences with the I.Q. have 
divided them, some making unfounded claims for its useful¬ 
ness and others refusing to recognize in it any values for edu¬ 
cational guidance. Unless, however, the factors which 
produce these differences of opinion are clarified, there is no 
more chance of understanding the uses and limitations of the 
I.Q. than of reaching an agreement on whether or not red is 
a good color. 

Purpose of this chapter. The purpose of this chapter is 
not only to present the major concepts and units of measure¬ 
ment which are commonly used, but also to describe in some 
detail their nature and derivation, in order that teachers may 
be more discriminating in their interpretation and use of in¬ 
telligence-test data. The determination of mental ages (M.A.) 
and I.Q.’s, to be described shortly, usually constitutes the 
first step in the guidance of learning. From them is obtained 
a general idea of the ability of the class as a whole and the 
nature of the task which confronts the teacher. However, 
intelligence-test data should not only provide a prediction of 
probable achievement, but also reveal how individual students 
learn most effectively. Only in this manner is it possible for 
teachers to offer effective guidance when students are having 
learning difficulties. The desirability and necessity for this 
analysis of the nature of each student’s ability will become 
clearer as the difficulties in the use of the I.Q. or M.A. are 
presented. 

The nature of intelligence tests. Group intelligence tests 
are very similar in appearance to ordinary objective achievement 
tests. They usually include items in arithmetic and 
reading, following directions, analogies, classifications, practical 
judgment, general information, and the like. While there 
is some variation in scoring, as in the case of achievement 
tests, the common practice is to allow one point of credit for 
each correct response. The credits are added to obtain a 
total score both for parts of the test and for the test as a whole. 

In general, individual intelligence tests use similar tasks 
but are organized differently. Instead of determining the 
level of intelligence on the total score, as in the case of most 
group tests, the intelligence level of each task of an individual 
test is derived and credits are combined. The examiner 
watches the student’s reaction to obtain important information 
aside from the general level of intelligence. 

The mental age (M.A.). The M.A. is a basic concept in 
the measurement of capacity to learn. It is derived from 
the test score of an intelligence test and is defined as intelligence 
or mental maturity which is equal to the average intelligence 
or mental maturity of a particular chronological age 
(C.A.) group. Thus a mental age of 84 months means mental 
maturity equal to the average of those who are 84 months 
old chronologically. If on an adequate intelligence test a 
large unselected group of pupils who are 92 months old chronologically 
make an average score of 175, pupils who subsequently 
take this test and make scores of 175 are said to have 
M.A.’s of 92 months regardless of their chronological ages. 

The intelligence quotient. The I.Q. is the ratio between 
mental age and chronological age. A student 8 years old 
(chronologically) who has a mental age of 12 years would 
have an I.Q. of 150.* The M.A. indicates the level of mentality, 
while the I.Q. reveals the rate of mental growth. An 
I.Q. of 150 indicates a rate of maturation 50 per cent above 
average. 
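
The computation of the I.Q. from the two ages may be written out as follows. This is a sketch rather than an official scoring rule: the function name and the choice of months as the unit are assumptions, and rounding to the nearest whole number is assumed as the way of carrying the division to two decimal places and dropping the decimal point.

```python
def intelligence_quotient(mental_age_months, chronological_age_months):
    """I.Q. as the ratio of M.A. to C.A., expressed without the decimal point."""
    ratio = mental_age_months / chronological_age_months
    return round(ratio * 100)


# The example from the text: M.A. of 12 years, C.A. of 8 years.
print(intelligence_quotient(12 * 12, 8 * 12))
```

A pupil whose mental age equals his chronological age receives an I.Q. of 100 under the same computation.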

Difficulties in using the I.Q. In the past relatively little 
use has been made of the M.A. except to obtain the I.Q. The 
latter has been and still is the favorite unit of measurement 
both of those who believe it useful and those who periodically 
attack and attempt to vanquish it with invective. The 
debates pro and con center around the constancy and the 
reliability of the I.Q., the differences between I.Q.'s obtained 
from different tests, and what the I.Q. really represents. 
Each of these problems will be explained briefly. 

Constancy of the I.Q. After schools began to use intelli¬ 
gence tests, teachers soon realized that on retests students fre¬ 
quently obtained I.Q.’s varying from those of the first tests 
by from five to ten points and sometimes as much as fifteen 
to twenty points. Many had held that the I.Q. was constant 
throughout life. “How,” they now asked, “can we ever be 
sure that we have the right I.Q., and of what use is a single 
measurement in planning educational programs?” 

This discovery led to many studies on the constancy of 
the I.Q. In general these studies may be interpreted to indicate 
that the I.Q. is relatively constant. Terman ² retested 

* Actually 1.5, but divisions are carried to two decimal places and decimal points dropped. 

² Terman, L. M. The Intelligence of School Children, pp. 138-44. Boston, Houghton 
Mifflin Company, 1919. 

tion.* This regression effect, or tendency for measurements 
of I.Q.’s (as well as of achievement) to regress toward the 
true measure, as well as actual changes in pupils between 

obtained from group tests, or a check of a group-test I.Q. 
with an individual test, is much better. 

Should the I.Q. be used? In view of the errors of measurement 
involved in obtaining any I.Q., and of the fact that 
different tests may yield different I.Q.’s for the same pupil, 
some teachers would like to discontinue the practice of 
using intelligence tests entirely. It is their contention that I.Q.’s 
are of no value. But they overlook several important considerations. 

First of all, the size of the probable error, or the P.E., can 
be determined and, when desirable, the comparability of 
I.Q.’s from different tests can be increased. Secondly, such 
teachers ignore the fact that in spite of the difficulties of 
measurement, I.Q.’s obtained from intelligence tests are much 
more reliable and valid than the estimates and judgments of 
ability which they would substitute for measured I.Q.’s. In 
the third place, they ignore the fact that they have been 
misusing the I.Q. in attempting to make it primarily an 
instrument of individual guidance. The evidence which follows 
indicates that I.Q.’s are of very limited value for this 
purpose, but that they do serve a very important function in 
group guidance. 

The I.Q. and school marks. Many teachers assume that 
the relationship between I.Q. and success in school is perfect 
or nearly so. That this idea is erroneous is revealed in 
Table 5. 

An inspection of Table 5 reveals that while one student 
with an I.Q. of 100 to 109 failed, two students with I.Q.’s 
of 80 to 89 received A’s. Those who received B’s ranged 
from 70 to 150 and above in I.Q. It is evident that a student 
with an I.Q. of 100 to 109 might receive any mark from A to E, 
even though the chances are 77 out of 170 that it would be 
a II. While in general pupils with higher I.Q.’s receive better 
marks than pupils with lower I.Q.’s, the fact remains that a 

assigned more intelligently; supervision and administration 

achievement or success. 

When a teacher is confronted with students who are not
succeeding, she is not interested primarily in the prediction of
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probable achievement as indicated by the I.Q., but in the
reasons for the existence of learning difficulties, in order that
she may aid students in solving and eliminating them. In
other words, she must analyze those factors which have combined
to produce the I.Q. in the first place, and the expression
of which in a single I.Q. score has in the past obscured rather
than clarified the guidance problem. Recent research * has
done much to clarify this situation and reveal that such factors
as perceptual memory, comprehension, mathematical

ability, inductive and deductive reasoning, and visual ability
are combined in various ways to produce different as well as
identical I.Q.'s. If two students of the same chronological
age earn identical scores of 125 on a group intelligence test of
three hundred items, they would have the same I.Q. But
they would not ordinarily be successful on exactly the same
items. In fact, it would theoretically be possible for each of
them to respond correctly to one hundred and twenty-five
items without duplicating a single one. If one of two scores
of one hundred and twenty-five items is based largely on
perceptual and memory items, and the other on mathematical
and reasoning items, it is clear that even identical I.Q.'s do
not mean the same thing. If these two students were having
reading difficulties, it is obvious that they would need different
types of assistance.
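
The arithmetic behind this claim can be checked with a minimal sketch. The item numbers assigned to each student below are invented for illustration; only the totals (300 items, a score of 125) come from the text.

```python
# Hypothetical illustration: which items each student passes is invented.
TOTAL_ITEMS = 300
SCORE = 125

# Student A passes items 1-125; Student B passes items 126-250.
student_a = set(range(1, SCORE + 1))
student_b = set(range(SCORE + 1, 2 * SCORE + 1))

# Both raw scores are identical and both fit inside a 300-item test.
assert len(student_a) == len(student_b) == SCORE
assert max(student_b) <= TOTAL_ITEMS

# Items answered correctly by both students.
shared = student_a & student_b
print(len(shared))  # 0
```

Because 125 + 125 is less than 300, two identical scores can rest on entirely different items, which is why the single score may conceal the information needed for guidance.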

To be useful in the diagnosis and solution of learning
difficulties, intelligence-test data must reveal the ways in which
different students learn and the reasons for failure to learn
effectively. It must also provide a prediction of probable
learning success as furnished by the M.A. and I.Q.

Among other alternatives, individual guidance may be improved
through the use of individual intelligence tests and
new-type group tests. Each of these will be explained.
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single-form test. The test has been greatly
Whereas the original Binet scale consisted of fifty-four
and the original Stanford-Binet ninety, the new Stanford-

Binet includes one hundred and twenty-nine

tion to their usefulness in guidance, both yield M.A.

and standard scores. They take
effects in retesting by giving


* Thurstone, L. L.




DIAGNOSIS OF ABILITY AND APTITUDE 41

The following is an excerpt from the Year III-6 level,
Comprehension I test, Form L, the purpose of which is to ascertain
whether the child can comprehend the situation suggested
and give a reasonably pertinent reply:

What must you do when you are thirsty?

Satisfactory responses: "Drink." - "I ask mama for some
water." - "Mama give me a drink when I go up." - "Drink
some water." - "Have some tea." - "Water." - "I drink last
night." - "Go to the sink." - "Tell my mommy." (Q.)
"She'll give me a drink," etc.

Unsatisfactory responses: "Thank you." - "I'm hungry." -
"Eat." - "Eat dinner, don't I?" - "Do nothing," etc.

The following excerpt is from the Superior Adult III level.
Its purpose is to test orientation. The subject
is given a card with the following statement and allowed to
solve it (without pencil or paper) while the examiner reads it:

"I drove west for two miles; then I turned to my right and drove
north for half a mile; then I turned to my right again and drove
two miles further. What direction was I going then?"

After this response, the examiner asks, "How far was I from my
starting-point when I stopped?"


The correct answer to the first question is "East," and the
second, "Half a mile."
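
The two answers can be verified by tracing the route on a grid. The following is only a sketch; the function name and the coordinate convention (x east, y north) are assumptions made for the illustration and are not part of the test.

```python
import math

# Unit vectors for compass headings (x = east, y = north).
HEADINGS = {"north": (0, 1), "east": (1, 0), "south": (0, -1), "west": (-1, 0)}
# Heading reached after a right turn from each heading.
RIGHT_TURN = {"north": "east", "east": "south", "south": "west", "west": "north"}

def trace_route(start_heading, distances):
    """Drive each distance in turn, making a right turn between legs."""
    heading = start_heading
    x = y = 0.0
    for leg, miles in enumerate(distances):
        if leg > 0:
            heading = RIGHT_TURN[heading]
        dx, dy = HEADINGS[heading]
        x += dx * miles
        y += dy * miles
    return heading, math.hypot(x, y)

final_heading, miles_from_start = trace_route("west", [2, 0.5, 2])
print(final_heading, miles_from_start)  # east 0.5
```

The two-mile legs west and east cancel, leaving only the half mile driven north.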






FUm. J. £EV FOR FIC'TFRf-; ??<>'* 







Test 6 

Directions: Draw a ring around the letter R under all right hands
and feet. Draw a ring around the letter L under all left hands and
feet.
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students in overcoming learning difficulties. Figure 4 is an

excerpt from the spatial-relations section of the California
Test of Mental Maturity, Elementary Battery.* The purpose
of the section is to test abilities related to orientation in time
and space, which are important in giving and following directions.
Pupils who test low in this and related tests frequently
have difficulty in understanding and in

working directions, hence they need special assistance in
tb« pinnm. However, without ilir itikkiwe *4 well a#i an*- 




a whote to aid in forecasting the expected achievement and 
in determining the success of the quarter's or semester's work.
The factors formerly hidden in the I.Q. which provide guidance
in helping children overcome learning difficulties should
be redetermined sufficiently often to provide the needed
guidance. Entering pupils, failing pupils, problem pupils, 
maladjusted pupils, and, in general, pupils who deviate 



QUESTIONS AND EXERCISES 

1. Why have teachers disagreed on the values of intelligence-test
results, particularly the I.Q.?

2. How are the M.A. and I.Q. determined?

3. What is meant by the constancy of the I.Q.? How constant is it?

4. Explain what is meant by the P.E. of an I.Q. Interpret a P.E. of
4 for an I.Q. of 110.

5. Why do different intelligence tests yield different I.Q.'s for the same
pupil?

6. What relation does the I.Q. bear to individual achievement?

7. Discuss in what connections the I.Q. and M.A. are most useful.

8. Discuss the limitations of the I.Q. and M.A. in individual guidance.

9. What is meant by diagnostic analysis of intelligence or mental
maturity?

10. Explain how an individual intelligence test may provide a diagnostic
analysis; a group test.

11. Develop a program for intelligence-testing, from nursery school to
college.
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CHAPTER THREE 


THE NATURE AND TYPES OF 
INFORMAL DIAGNOSIS 


Informal objective tests and other informal diagnostic
procedures are used primarily to evaluate specific elements
of achievement, in order to guide the learning of all pupils
more effectively and to identify the specific obstacles of those
who are having learning difficulties. This is a distinctly different
function from that performed by the traditional intelligence
test, which by means of the M.A. and I.Q. attempts to predict
general school success. It is also different from the functions
performed by the individual intelligence test or the new-type
group intelligence tests, which, in addition to a general prediction
of success through the M.A. and I.Q., reveal the probable
reasons for the occurrence and persistence of these learning
difficulties and suggest the manner in which such obstacles
may be avoided or eliminated. The informal test thus locates
the difficulty. The individual and new-type group intelligence
tests explain it and furnish the data for its solution.

Measurements of intelligence or mental maturity are rela¬ 
tively stable and need rechecking only periodically. In con¬ 
trast, informal diagnosis is an almost continuous process which 
furnishes the data for guiding the attainment of a constantly 
changing complex of facts, principles, skills, understandings, 
relationships, ideals, attitudes, and habits related to the
major aims and purposes of a unit, topic, or course. The
effectiveness of this continuous process of informal diagnosis
is dependent upon an understanding of two other major factors,
namely, the problem of individual differences and the problem
of objectives. Since the use of measurement in improving





presented in some detail.

Individual differences. The fact that students are
seated in the same classroom or assigned to one grade is
no proof of their homogeneity, and no amount of teaching,
formal or standardized, will make them more alike. Measurement,
on the contrary, reveals the inevitable differences among
them. If a group of words in spelling, a set of problems
in mathematics, or the significance of a movement in history
be used as the basis for a class test, wide differences in achievement
will result; in fact, the difference in achievement between
the best and poorest student will often be at least three
times as great as the difference between the average
students of two succeeding grades.




Taljfe 7 reveals tfial lit a fiiigle filiii %fMh fW'fHaJ m,i*on*-y 
varied from eighty lc» rinr bniirlfc'd anil wssiy .x 

tion of over *w yeMn. TItr clilfrmitT* lwt«r*rn tli*- 
of i«cc««iing grade# i# aktwi onr year d mrri>k .«•* !n 
reading ability, a sBiiiilar iliooKh Miftiemlwt |r%n#'j v.inaisr-n 




existed; pupils ranged from the average of third-graders to
the average for seventh-graders, a variation of about four
years. The difference between the averages of succeeding
grades is about one year.


for the most part still ignore them in actual teaching, and
treat classes as if they were largely homogeneous. Pupils
frequently fail not because they lack the qualities or maturities
on which success depends, but because their teachers are
obsessed with concepts of a class which never existed. These
teachers do not consider that some students do not learn well
from lectures or telling, that others obtain much more assistance
in this way than from reading, and that still others profit

difficulty in understanding or following directions, whereas 
others anticipate the suggestions which teachers offer. Most
members of most classes successfully attain the objectives
of a particular unit or topic, yet every member of every class
at one time or another experiences some type of learning
difficulty, no matter what his ability or past achievement.
It is for these reasons that a continuous program of diagnosis
is so important.

The problem of objectives. In a sense, the problem of
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constitute a compromise between varying viewpoints within
the ranks of educators themselves and the special interests of
outside pressure groups. 

Over a period of two decades, American education has 
developed many excellent statements of general objectives.
Beginning with the Cardinal Principles of Secondary Education
in 1918, and clarified and augmented by statements of
individual leaders, as well as of educational organizations,
the movement has culminated in a comprehensive statement * 
of the following ten social-economic goals of education, which 
have gained wide theoretical acceptance: 

1. Hereditary strength 

2. Physical security 

3. Participation in an evolving culture 

4. An active, flexible personality

5. Suitable occupation 

6. Economic security 

7. Mental security 

8. Equality of opportunity 

9. Freedom 

10. Fair play 

However, acceptance of any or all of these general objec¬ 
tives * guarantees no particular desirable outcomes in any 
classroom, because many different programs can be developed 
from the same set of general objectives. General objectives
constitute only a frame of reference within which desirable
educational programs must still be developed; a generaliza¬ 
tion has meaning only in terms of the specific elements on 
which it rests. 

General objectives must be analyzed into their specific 
elements and implications in order that pupil activity may 

* National Education Association, Committee on Social-Economic Goals of America.
Tentative Report. Washington, D.C., 1933. Also "What Are Desirable Social-Economic
Goals for America?" in Journal of the National Education Association, vol. 23, no. 1
(January, 1934).

* These general objectives were outlined in some detail in the tentative report.
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be adequately directed toward their attainment, and thus 
should be a process of careful experimental derivation and 
not hurried armchair meditation. The resulting related 
specific objectives are the facts, knowledges, implications, 
appreciations, understandings, skills, and the like, which 
function in developing generalizations, principles, attitudes, 
ideals, and desirable conduct. They do not function as 
separate entities in any specific serial order in learning, but
constitute elements of a total situation which may operate in 
many different ways to attain various ends. The absence of 
one or more of these elements in a learning situation may lead 
to confusion and block normal progress, hence the necessity 
of detecting learning difficulties in terms of these specific ele¬ 
ments. There is, therefore, no real conflict between facts and 
reasoning, knowledges and ideals, or subject-matter and con¬ 
duct. Difficulties arise from singling out one or more types 
of specific objectives and treating them only as separate ends 
in themselves. 

Deriving specific objectives. Many seventh-graders still 
study the “geography” of England from an outline somewhat 
like the following: 

1. Latitude and longitude 

2. Boundaries 

3. Land and water forms 

4. Climate 

5. Natural resources 

6. People 

7. Industries 

8. Products 

9. Cities 

10. Imports and exports

The principal outcomes consist of more or less isolated and 
unrelated facts memorized in connection with each topic, and 
depend largely upon the accidental nature of available ma¬ 
terials and the 



52 DIAGNOSIS AND THE DIRECTION OF LEARNING 

If, however, in studying England the teacher establishes as
one of the major objectives the history of England's development
as a great commercial nation, the elements of each of
these topics relate to a common center of interest. Facts and 
information acquire meaning and functional value, and under¬ 
standings, appreciations, and attitudes are attained. 

But the number of potential specific elements related to 
this general aim is legion. Only relatively few are important
and need be utilized. This does not limit learning but merely 
directs it; students will inevitably learn many other things. 
In the skills, much progress has been made in the determination 
of certain types of specific objectives. Most of the important 
words in reading, writing, and spelling, and the most essential 
steps in learning to read and master certain arithmetic pro¬ 
cesses, are known. In other fields, the problem of determining 
the most important specific functional elements in attaining
a major objective is just now being attacked.

Because administrators and teachers need more detailed
guidance, the original statement of general
social-economic goals has been supplemented by a recent
committee report * which expands and clarifies the implications of
the excellent tentative report. However, to be most effective
in influencing learning, the principles laid down in each major
field must still be analyzed until their implications serve as
the objectives of the specific types of educational activity
which must characterize classroom procedures.

The following procedure is recommended for clarifying 
general aims and purposes and increasing the specificity of the 
elements through which these aims and purposes derive mean¬ 
ing and attainment. 

1. From the Social-Economic Goals for America and the subsequent
report on their implications for education, select the major aims or
purposes of the course, subject, topic, or unit for general guidance.

* Implications of Social-Economic Goals for Education. Washington, D.C., National
Education Association, 1937.
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2. Study each unit, section, topic, and activity and attempt to deter¬ 
mine what facts, implications, understandings, skills, and the like 
will function most in the attainment of these major aims and 
purposes. 

3. Compare both types of lists with colleagues and attempt to determine
the most important. When possible, subject such lists to
experimental study.

4. Utilize these objectives in selecting materials and procedures. 

5. Make major objectives the guiding, co-ordinating principles of 
learning situations in which specific objectives are attained as 
functional elements. 

6. Test for the attainment of specific objectives and modify specific 
objective lists from time to time in the light of 

a. The extent to which they are reasonable and attainable by the 
learners for whom they are intended. 
b. The extent to which they function in attaining major aims and
purposes. 

c. The extent to which they are important in themselves and 
should be attained without reference to more general purposes. 

Informal objective tests. The contribution and effectiveness
of informal objective tests in improving learning rests
largely upon the skill and persistence of teachers in identifying 
and validating (see 6 above) these important specific objectives. 
Adequate informal objective tests are simply samplings of 
these objectives in test form. In so far as possible, test items 
are so stated that they will not confuse or mislead the student 
but will require definite unequivocal responses which may 
reveal such outcomes as a knowledge of facts and their implica¬ 
tions, certain skills, and ability to use facts and principles in 
solving problems, as well as appreciations, ideals, and attitudes. 

Such tests may save the time and energy of teachers if they 
are based on the objectives of a topic, unit, or course, and are 
given before related educational activities begin. The teacher 
in this way discovers which objectives are already attained
and those toward which educational activities should be
directed. Additional tests sampling these unattained objectives
follow from time to time until all objectives, theoretically,
have been attained. In the process these tests also identify
the specific learning difficulties of students, so that activities
can be modified to meet individual needs. Such tests may be
given orally, written on a blackboard, or given in mimeographed
form. They may be scored immediately with the help of 
students and the results tabulated. Thus they may provide 
for the teacher immediate guidance in the work of the day, 
and for each student a list of his own difficulties as a guide to 
his further efforts. 
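
The tabulation described above may be sketched as follows. The objectives, item numbers, and pupil names are hypothetical and serve only to show the form of the record; a teacher would substitute the specific objectives of her own unit.

```python
from collections import Counter

# Hypothetical mapping of pretest items to the objectives they sample.
ITEM_OBJECTIVE = {
    1: "locates England on a map",
    2: "names chief exports",
    3: "explains why trade grew",
}

# Hypothetical pretest results: True means the item was answered correctly.
RESPONSES = {
    "Pupil A": {1: True, 2: False, 3: False},
    "Pupil B": {1: True, 2: True, 3: False},
    "Pupil C": {1: True, 2: True, 3: True},
}

def unattained(responses):
    """List, for each pupil, the objectives whose items were missed."""
    return {
        pupil: [ITEM_OBJECTIVE[i] for i, ok in sorted(answers.items()) if not ok]
        for pupil, answers in responses.items()
    }

def class_tally(difficulties):
    """Count how many pupils missed each objective."""
    return Counter(obj for objs in difficulties.values() for obj in objs)

difficulties = unattained(RESPONSES)
tally = class_tally(difficulties)

for pupil, objs in difficulties.items():
    print(pupil, objs)
print(tally.most_common())
```

The per-pupil lists serve the student as a guide to his further efforts, while the class tally shows the teacher which objectives still require group work and which (here, the first) are already attained.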

Types of informal objective tests and items. Objective 
tests may take many different forms. Basic forms, typified 
by true-false, multiple-choice, and completion tests, together
with their variations, number several dozens. However, more 
important than mere test forms are the different types of out¬ 
comes which test items are intended to measure. Some of 
these items test the knowledge of facts primarily. It is often 
important to determine whether lack of knowledge of basic
facts, failure to grasp their implications, or inability to use
facts is the primary cause of failure to arrive at proper generalizations,
principles, or conclusions. The functionality of a
fact is not, however, dependent upon the fact as such, but
upon the situation in which it is identified, learned, and acquires
meaning. Other items test implications rather than
the facts themselves. Still other items require only the conclusions
of a process of thinking which must precede and in
which such factors as a knowledge of facts, their implications,
and their relationship to other facts and inferences play a part.
It is by discriminating between these types of items that 
reasoning difficulties can often be diagnosed. 

Other items detect interests, appreciations, ideals, and 
attitudes. Sometimes the responses to such items can be 
definitely scored as right or wrong. Often, however, they are
included to detect how students feel about certain institutions, 
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movements, regulations, subjects, principles, generalizations, 
conduct, and the like, in order that educational activities may 
be directed to the attainment of the objectives deemed desir¬ 
able in a particular school, community, or state. 

The most important forms of objective tests, together with 
the types of outcomes which they identify, will now be pre¬ 
sented. 

THE TRUE-FALSE TEST 

The true-false test is the most often used of the various forms 
of informal objective tests. Many teachers know no other 
form. In its simplest form, the student examines each state¬ 
ment or test item and judges only whether it is true or false. 
Frequently, he must also judge whether the item can be both 
true and false, whether information which precedes the items 
reveals the truth or falsity of the statements, and so forth. 
The familiar yes-no, didn't say, and doubtful captions are mere 
variations of this form. The following are samples: 

Directions: Circle the T for items you believe to be true, the F for 
those you believe to be false, the N for those you believe
are neither true nor false, and the D for those whose 
truth or falsity cannot be determined from the data given 
or any other data. 

1. Lake Superior is the largest fresh-water lake in
   the world.                                          T F N D

2. A symphony concert is more interesting than
   a movie.                                            T F N D

3. It is better to lose a game than win by playing
   "dirty."                                            T F N D

4. After the Revolutionary War, America was
   thrown upon her own economic resources, and
   it was difficult for her to make much progress
   or to compete with England in the textile field.
   England would not allow machinery or models
   of machines to be exported, and discouraged
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her skilled mechanics from leaving their own
   country. In 1789, however, Samuel Slater,
   who had worked in an English factory, built the
   first cotton spinning mill in America.

   The textile industry developed slowly. While
   the Embargo Act of 1807 and the War of 1812
   helped, it was not until after 1865 that mass
   production really began. The United States
   soon advanced to first rank among textile-producing
   nations, but recently foreign competition
   has become an important factor.

   a. The best title for the above story is Samuel
      Slater.                                          T F N D

   b. Slater built the first cotton gin in America.    T F N D

   c. England adopted an unfriendly attitude
      toward America's textile industry.               T F N D

   d. The United States no longer ranks first
      among textile-producing nations.                 T F N D

   e. By 1812, the United States was producing a
      large part of the world's cotton.                T F N D

The first item is factual; the second reveals interest and
appreciation; the third reveals an ideal or attitude. The fourth
type reveals ability to get information and make inferences
and generalizations. The second and third types of items
are usually grouped together, and not mixed with other types,
since a particular interest, appreciation, ideal, or attitude
may be neither right nor wrong. However, when mixed with
other types, they are usually scored separately.

Some have objected to the true-false type of test because 
of the potentially harmful effects of wrong forms and
statements. However, research reveals that this fear is 
groundless because of the attitude of the student when he is 
taking a test. 

True-false tests have been criticized because a student may
get half the items right by marking them all true or false.
This is a purely theoretical objection. No one seriously taking
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a test would admit his dependence on this chance, since some
items may be neither true nor false. Furthermore, this objec¬ 
tion assumes that the number of true and false items will 
always be equal in each test, which is poor testing technique 
because it encourages students to lean on this crutch and 
defeats the purpose of the test. However, most diagnostic 
tests are not used for marking purposes; hence there would 
be no point in attempting to "beat the test." When true-false
tests are used for marking, all students will guess on some of 
the items, and there is no way to avoid it; but the student who 
must depend upon guessing will almost invariably land at 
the foot of the class, where he belongs. Carefully prepared 
diagnostic tests do not utilize one form exclusively and often 
do not use true-false tests at all. 
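
The force of this argument can be examined with a little arithmetic. The sketch below is illustrative only: the test lengths and the proportions of true and false items are assumed figures, and the function names are invented for the example.

```python
from math import comb

def blanket_score(n_true, n_false, mark):
    """Items right when every item is marked the same way ('true' or 'false')."""
    return n_true if mark == "true" else n_false

def prob_at_least(n_items, k, p=0.5):
    """Chance of k or more items right out of n_items by blind guessing."""
    return sum(
        comb(n_items, j) * p**j * (1 - p) ** (n_items - j)
        for j in range(k, n_items + 1)
    )

# A test with equal numbers of true and false items rewards the crutch.
print(blanket_score(25, 25, "true"))   # 25

# An unbalanced test (assumed 35 true, 15 false) defeats it for one choice.
print(blanket_score(35, 15, "false"))  # 15

# Probability of reaching 35 of 50 by coin-flipping alone.
print(prob_at_least(50, 35))
```

Marking every item alike yields exactly half credit only when true and false items are equal in number, and the last figure shows how seldom pure guessing carries a student far above half, which is consistent with the guesser landing at the foot of the class.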

Probably the greatest disadvantage of this form is the ease 
with which it may be abused. Trivial and meaningless items 
often receive as much attention as crucial items. Generaliza¬ 
tions not justified on the basis of evidence, items partially 
true and partially false, confusing double negatives, answer 
clues, and unsupported personal opinions of instructors 
find their way into tests without the realization of the test- 
maker. 


THE COMPLETION OR RECALL TEST 

This test is a favorite type and if well constructed may 
possess high validity. It usually takes one of the following 
forms: 

1. ............ is the President of the United States.

2. Columbus came to America in ............

3. Palms grow well in California because ............


The recall type test is often criticized because it tests 
memory rather than reasoning or some other outcome. Test- 
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ing for a knowledge of important facts is legitimate and de¬ 
sirable in itself. It is the testing of incidental and indifferent 
facts, or testing factual outcomes to the exclusion of other 
outcomes or without reference to their implications or uses, 
that should be avoided. Again, the question as to whether
an item tests memory or reasoning primarily does not depend
on the form in which the test is cast; there is no such separation
in the complex mental processes whose activity we sample
through tests. What is primarily a memory test for one student
may involve reflective thinking or reasoning for another,
depending upon whether or not the student has experienced
or "thought through" the problem previously.

This is true even in the most complicated situations. Take, 
for example, the following involved problem: 

If a man out hunting saw a bear three hundred yards to the east
of him, and he walked three hundred yards north, shot straight
south, and killed the bear, what would be the most probable color
of the bear?

The answer is "White," and is derived in the following
manner: The only place on the globe where the conditions
named could result in the death of the bear would be at the
North Pole. The man, therefore, was three hundred yards
from the Pole when he sighted the bear. When he reached
the Pole, the only direction in which he could shoot and still
hit something on the earth was south. Having located the
place geographically, it may be deduced that the only bear
which could survive in this latitude is a polar bear, and polar
bears are white. This conclusion may be reached through
reasoning, but once experienced, the answer "White" may
persist as a pure memory fact. In the interests of accuracy
it should be pointed out that polar bears do not live so near
the Pole, but the illustration is nevertheless valid.

To the extent that answers depart from the one-word type
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and can be stated in different ways by the use of synonyms, 
objectivity of scoring is destroyed, but the form still possesses 
value for diagnostic purposes. It is of distinct value in detect¬ 
ing shortcomings in important potentially functional knowledge 
and information. 


THE MULTIPLE-CHOICE TEST 

This test is also known as the best-answer and the multiple- 
response test. It may consist of a question, a statement, or a 
problem followed by several possible answers. Multiple- 
choice tests are frequently used following a problem, an exer¬ 
cise in reading, or the telling of a story. Some of the forms 
which it takes are presented here: 

1. Who wrote Gone With the Wind?

(1) Lewis (2) Ade (3) Mitchell ........

2. The gangster was

(1) hung (2) hanged ........

3. Dispute means

(1) disturb (2) question (3) subdue
(4) disagree ........

4. If you have a free night, which would you rather
do?

(1) attend a concert (2) read a book (3)
work in a carpenter shop with tools ........

5. When an acquaintance calls attention to one of
your faults, it is usually because

(1) He is irritable.

(2) He is jealous.

(3) He is interested in you.

(4) He wishes to humiliate you. ........

6. Your cousin must be older, younger, or the same
age as you. But he is neither older nor younger.
Therefore,

(1) Your cousin is younger than you.

(2) Your cousin is older than you.

(3) Your cousin is the same age as you. ........
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This is one of the most useful types of objective tests, but
is probably one of the most difficult for beginners to construct.
The first three samples illustrate items primarily factual in
nature. Items like number 4 may be used to sample interests,
values, or appreciations. Number 5 reveals attitudes and beliefs.
Number 6 illustrates one type of reasoning. However,
any problem situation may be stated and followed by a multiple
choice of reactions. The choices which follow may have
but one correct response; or all responses may be correct, but
one preferred; or there may be a certain number of correct answers
to be selected from those given.

Multiple-choice tests must usually be given in written form,
since it is difficult for students to remember the various possible
answers to questions when the tests are dictated. These tests
are difficult to make. For example, there is a tendency to
make the wrong responses so unrelated to the task that no
real thinking or discrimination is required. Furthermore,
as in other objective tests, there is constant danger of formulating
items for which none of the choices is the correct response.
When all answers are correct but one is presumably
best, mistaken personal opinions frequently render the items
valueless. For short diagnostic tests it is probably better
for the beginner to avoid this form. For more comprehensive
tests to be used for appraisal purposes and developed
over a period of time, this form is of unquestionable
value.


THE MATCHING TEST 

This is a special form of the multiple-choice test and is useful
for testing a large number of similar outcomes with a small
expenditure of time, energy, and materials. Among the many
possibilities are the following: matching discoverers with discoveries,
inventors with inventions, names with locations,
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men with movements, concepts with definitions, and authors 
with their works. The following, largely a fact-testing tech¬ 
nique, is a sample of one of these forms. It should be observed 
that some outcomes do not lend themselves to this treat¬ 
ment. 


On the blanks preceding the various works the student writes the 
number of the correct author: 

 1. Twain          ...... An American Doctor's Odyssey
 2. Scott          ...... The Outward Room
 3. Mitchell       ...... It Can't Happen Here
 4. Shakespeare    ...... Ivanhoe
 5. Lewis          ...... Twelfth Night
 6. Homer          ...... Huckleberry Finn
 7. Stevenson      ...... Gone With the Wind
 8. Farson         ...... Last of the Mohicans
 9. Cooper         ...... The Way of a Transgressor
10. London         ...... Treasure Island
                   ...... Iliad
                   ...... Call of the Wild


Care must be observed not to make this test too long be¬ 
cause the nature of the form makes the problem very complex. 
With fifteen pairs, for example, there are fifteen choices for 
the first item, fourteen choices for the second, and so forth. 
Again, several different types of items, singular and plural 
forms, and other answer clues must be guarded against or 
validity may be destroyed. 
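
The growth in complexity with the length of a matching test can be shown by a brief computation. This is a sketch of the counting argument only; the variable names are chosen for the illustration.

```python
from math import factorial

pairs = 15

# Choices open to the student at each step: 15 for the first item,
# 14 for the second, and so on down to 1.
choices_per_item = list(range(pairs, 0, -1))

# Total number of alternatives weighed across the whole exercise.
total_choices = sum(choices_per_item)

# Number of distinct ways the fifteen answers could be assigned.
arrangements = factorial(pairs)

print(total_choices)  # 120
print(arrangements)   # 1307674368000
```

A ten-pair exercise, by the same reckoning, involves 55 alternatives, which suggests why shorter matching tests are easier for the student to manage.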


LOCATION AND IDENTIFICATION TESTS 

These tests enable the tester to sample facts, understandings, 
and abilities with great saving in time. However, they usually 
involve drawings and other pictorial materials. Unless the 
materials can be furnished in the forms of outline maps in 
social studies, the skeleton in physiology, schematic views of 
plants in botany, and so forth, the tests cannot be utilized to
advantage for informal diagnosis. Figures 5 and 6 are
illustrations.


Test 3. Crustaceans 


Directions: In each pair of parentheses at the right of the page put 
the name of the part of the crayfish that has the same number on 
the drawing as the parentheses in which you write your answer. 



FIG. 6. IDENTIFICATION TEST 

From Blaisdell, Instructional Tests in Biology. Yonkers-on-Hudson, World Book Company.


CHRONOLOGICAL, LOGICAL, AND RANK-ORDER TESTS


The following are samples of these tests: 


A. Arrange in correct chronological order by numbering 1, 2, 3,
etc.

Articles of Confederation ......
Battle of Bunker Hill ......
Declaration of Independence ......
Paul Revere's Ride ......

B. Indicate the relative lengths of the waves of each of the
following radiants, beginning with 1 for the longest:

cosmic rays ......
X-rays ......
gamma rays ......
radiant heat ......
wireless waves ......

A. Compute the answer and write it on the line to the right:

(1) 2 × 6 ......

B. Three boys, Tom, Paul, and Jim, are on a ladder. Tom is
farther up than Paul, and Jim is farther up than Tom. Which
boy is in the middle on the ladder?

C. A man desires to go from Los Angeles to New York. He may
fly, drive his automobile, or go by boat or train.

(1) How would the various methods compare in time, assuming
he made the trip as quickly as possible by each method?

(2) What would be the relative cost, assuming he kept expenses
at the minimum in each case?

The test items used above are excellent for diagnosis.
However, when used as part of a more comprehensive test for
appraisal purposes, they may lead to complications because
of the difference in marking practices among teachers, who
give varying amounts of credit for the use of right processes,
even though the answer is wrong, and also vary the credit
for right answers if the wrong processes have been introduced.
This system of marking, of course, destroys objectivity of
scoring and validity of the test as an appraisal instrument.
For appraisal no partial credits should be given.

Among the many other types of tests used by individual
teachers are the following. The possible variations are almost
unlimited.

1. Reproduction. Drawings, music, poetry, or prose are
reproduced from memory.

2. Vocabulary test. This test may be used in the multiple-choice or
one-word answer recall form, but is frequently used as an oral
test with very young children.

3. Opposites, similarities, and analogies. Situations and concepts in
both verbal and visual form are utilized to test attention,
observation, and understanding. These tests may be in
multiple-choice or simpler form, written or oral.
relations or visual ability in the multiple-choice form. The
subject locates among the four choices one that is identical with the
first drawing. Such a test would rarely be made by a teacher,
but might prove of distinct value for diagnosis.

7. Skill tests. Many tests in reading, writing, language, spelling, and
arithmetic may be classed as skill tests. Four important skills
related to reading are illustrated below:


Table of Contents

Chapters                                Pages

1. The Vastness of the Industry ......      1
2. Methods of Handling ......              25
3. Land and Water Transport ......         40
4. Chicago's Stockyards ......             55
5. The Embargo ......                      60
6. Reindeer Meat ......                    70
7. Imported Meat ......                    77


6. On what page does "Methods of Handling" begin? ...... 6

7. Which of these stories is on page 75?

1 Reindeer Meat
2 The Embargo
3 Imported Meat ...... 7

Underline the correct answer. Write its number
on the line to the right.


1. The introduction is found in what part of the book?
1 beginning   2 middle   3 end ...... 1

2. The appendix is found in what part of the book?
1 beginning   2 middle   3 end ...... 2

3. A glossary contains:
1 index   2 definitions   3 pictures ...... 3


Look at this index and find the answers to questions 8, 9, and 10.

INDEX

Bananas: in Africa, 345; in Central America, 331;
in China, 399; in Colombia, 252; importance of,
as food, 231; in Mexico, aaS; in the Philippines,
ad«; in the West Indies, 334. Bangkok, the
"Venice of Asia," 39*, 3 ^ 3 *.

Barcelona, 336.

Barge Canal, as trade and travel route, iia.

Barley: in Afghanistan, SHi; in Africa, 359; in
Alaska, 204; in Arabia, $ 79 %; in Bulgaria, 33 »;
in Chile, 264; in Czecho-Slovakia, 317; in Denmark,
307; in Finland, agfj; in New Zealand,
417; in Norway, 292; in Poland, 316; in Sweden,
394. Barranquilla, 352.

8. Information concerning Barcelona will be found on
what page? ...... 8

9. Information concerning bananas in Mexico will be
found on what page? ...... 9

10. Information concerning barley in Denmark will be
found on what page? ...... 10


Underline the two best topics to look up in an encyclopedia
or reference book for information on the following subjects.
Write the two numbers on the line to the right.

11. Raising Sheep in Australia
1 Australia   2 Animal Husbandry   3 Sheep
4 Farming   5 Meat ...... 11

12. Rubber Plantations in Brazil
1 Automobile Tires   2 Plantations   3 Rubber
4 Brazil   5 South America ...... 12

13. Communication by Radio
1 Telephone   2 Radio   3 Cables
4 Inventors   5 Wireless ...... 13

14. Destructive Insects in the Citrus Industry
1 Horticulture   2 Industry   3 Citrus Fruit
4 Destruction   5 Pests ...... 14

15. The Baseball Game in America
1 Schools   2 Big League   3 Baseball
4 Games   5 America ...... 15




Diagnostic rating scales. In addition to the various types
of tests, other informal procedures have been developed for
the purpose of diagnosing learning difficulties. Traditional
rating scales were developed primarily for appraisal purposes,
that is, for the purposes of determining the total or magnitude
of achievement at the end of a course, semester, or quarter,
and not for the purpose of detecting learning difficulties and
the direction of learning. However, the concepts rated were
frequently so vague and general that the ratings resembled
the scores on essay examinations when different teachers
marked the same paper. Wide differences of opinion
frequently resulted. When such general concepts as personality,
skill, or ability are rated, the results are almost certain to
possess low reliability. When, on the other hand, such
concepts are broken down and analyzed into their constituent
elements, or defined, reliability is increased. The following
illustration presents part of an attempt to define certain points¹
in the range of manifestation of a trait or ability which is
represented by the length of the line. Such a device is known
as the graphic rating scale. Reliability of rating was raised
very markedly with this device.


1. Disposition
   Unusually optimistic; animated; cheerful
   | Ordinary; generally in good humor
   | Dejected; melancholic

2. Character
   Irreproachable; highest ideals and conduct
   | Conforms; no special strength or weakness
   | Weak, unstable

3. Mentality
   Brilliant, unusual
   | Average; sufficient for ordinary needs
   | Stupid; dull

4. Tact
   Meets difficult situations frankly, adequately, and without offense
   | Meets ordinary situations satisfactorily
   | Cannot get along; many antagonisms

5. Skill
   Expert; resourceful; utilizes instinctive tendencies; makes
   objectives attractive; maximum results
   | Average; pupils make some contribution; many work
   independently with fair success
   | No pupil participation; teacher conducts and controls; work
   mechanical; pupils helpless; no results


¹ See Tiegs, E. W. An Evaluation of Some Techniques of Teacher Selection. Bloomington,
Public School Publishing Company. 1928.



VIII. ALIGNMENT CHART — GRADES THREE AND FOUR 

Correct alignment. Correct alignment can be secured if the paper is held correctly and shifted to 
the left often enough to keep the writing directly in line with your eyes. 



FIG. 8. NYSTROM DIAGNOSTIC WRITING SCALE, GRADES 3 AND 4



Writing below the line. Writing below the line is usually caused by slanting the paper too much.




Properly constructed, rating scales are useful in both diagnosis
and appraisal, since certain outcomes do not yet, and may
never, lend themselves easily to new-type tests of the
pencil-and-paper variety. Systematic observation and
record-keeping must be used, temporarily at least. Such outcomes
as interests, attitudes, and ideals can be sampled through
pencil-and-paper tests, but the teacher learns much through
systematic observation and record-keeping as well. Conduct
in the sense of social and personal adjustment will also yield
to rating to the extent that these terms are analyzed into
significant observable elements, and carefully recorded
observations made.

Concepts like creativeness, initiative, and originality have 
been a source of great confusion in education during the past 
decade. Fantastic claims for their development have been 
advanced for a variety of apparently new procedures. Many 
teachers who have been led to accept allegations as if they 
were proven facts have imitated these procedures in the belief 
that alleged outcomes were sure to follow. First attempts to 
define these concepts usually result in the production of lists 
of synonyms as vague and general for rating purposes as the 
original concepts. Teachers are often hostile to the idea that 
before such outcomes can be evaluated they must be treated 
quantitatively. However, when the specific samples of
conduct which are acceptable to teachers as evidences of such an
outcome as originality are listed, and observation reveals that
one of two students exhibits five while the other exhibits twelve,
the possibilities of more adequate diagnosis and appraisal
become more evident.

At present we regularly use many concepts the meanings of 
which are not at all clear or universal, and for which we can¬ 
not, therefore, have adequate methods of diagnosis or measure¬ 
ment. We must, therefore, attempt to break these concepts 
into their cruder constituent elements, or simply agree that 
certain ascertainable facts will be accepted as evidence of the
presence of the traits or abilities concerned. 

Diagnostic product scales. Some outcomes of learning 
lend themselves to diagnosis through diagnostic product scales. 
Writing, drawing, shorthand, and certain aspects of shop- 
work, agriculture, and home economics are among the fields 
of usefulness of such instruments. For example, in home 
economics variations in ability to sew seams may be represented 
by actual samples from the best to poorest, and specific defects 
pointed out. Students, by comparing their work with the 
scale, learn what they should strive to correct. Figure 8 
presents a diagnostic product scale in writing. Students 
compare their work with the scale and have their future 
efforts guided by exercises to correct their difficulties. 

Standardized tests and informal diagnosis. Standardized 
diagnostic tests contain a variety of test forms. Research 
usually contributes to the careful selection of the most crucial 
and significant items. At the same time, incidentally, they 
reveal the extent to which individuals and classes are attaining 
the objectives represented by the test items. 

Teachers frequently have little or no training in diagnostic 
testing, and their assigned teaching programs are so arduous 
that they find little time to develop proper diagnostic tests. 
Standardized diagnostic tests contain many types of items 
which may be used as samples in constructing informal tests. 
Imitation of these items in terms of the objectives of specific 
topics, units, or courses will contribute to better diagnosis 
and to the improvement of learning. The construction and 
uses of diagnostic procedures will be treated more fully in 
Chapter Four which follows. The diagnostic functions of 
standardized tests will receive attention in Chapters Five to 
Ten inclusive. 





QUESTIONS AND EXERCISES 

1. Distinguish between achievement diagnostic testing and diagnosis 
of mental maturity. 

2. What is meant by individual differences, and of what significance
is this concept in improving learning?

3. Contrast general and specific objectives; explain the significance 
of each in improving learning. 

4. Explain and illustrate the most important forms of informal objec¬ 
tive tests. 

5. In what way are different types of test items related to different 
types of outcomes? Give examples of different types of test items. 

6. What are the principal difficulties in developing diagnostic rating 
scales? In what manner may they be overcome? 

7. Explain what is meant by diagnostic product scales. Give examples. 

8. How may the use of standardized diagnostic tests aid in improving 
informal diagnosis? 

CHAPTER FOUR


CONSTRUCTING AND USING 
DIAGNOSTIC PROCEDURES 


The usefulness of diagnostic procedures is dependent not only
upon skill in formulating specific test items, and in
constructing complete diagnostic tests, scales, or charts, but upon other
factors as well. Although these factors are important, the
techniques involved are relatively simple and can easily be
learned. The more or less specific test and scale items which
are used in diagnosing learning and adjustment difficulties
are based upon specific elements of larger learning situations
or patterns. Differences arise in determining which of these
specific elements, or objectives, are important or crucial and
should receive attention, and in determining exactly how they
function in learning and adjustment. It will be profitable to
consider these problems before dealing with the more obvious
aspects of test and test-item construction.

Evolution and differentiation of objectives. The mastery
of skills and subject-matter as such was perhaps the major
objective of early American education, and still persists in
modified form, especially on the secondary and college levels.
It has, however, fallen into disrepute with many teachers and
educational leaders on all levels because the attainment of
such an objective does not result in intelligent and significant
participation in the business of living. Other teachers have
moved from this position to an emphasis on meanings and
implications of facts, and abilities related to using them in
solving academic problems. Still others emphasize the necessity
of developing learning activities in terms of problems and
experiences which are real to children and youths themselves,
in which adequate adjustment to these immediate problems
and tensions is paramount, and in which such more or less
specific elements as facts, information, appreciations, ideals,
and skills are significant only as they function in adequate
adjustment.* Between these three positions are many
intermediate differentiations, extending from the most traditional
to the most radical concepts of the exact goals toward which
educational activities should be directed.

The specific elements which a teacher may select as the 
basis for a diagnostic test will depend, therefore, at least 
partially, upon the particular concepts held, and the particular 
objectives or goals deemed important. To some, the term 
“objective” means facts and information only; others believe 
that appreciations, ideals, attitudes, and adjustment or con¬ 
duct patterns are just as truly important and direct objectives 
of the curriculum. 

Thus in any one community or in a single school, the objec¬ 
tive tests which teachers construct and use often reveal a 
bewildering variation of form and content. In other situa¬ 
tions all may use a single stilted form. Critics of diagnostic 
testing often offer these situations as evidence of the failure or 
futility of testing itself. In reality failure of testing arises 
largely from the conflict among teachers themselves concern¬ 
ing the legitimate functions of objectives. 

The functionality of objectives. The desirability of analyz¬ 
ing major aims and purposes into implications sufficiently spe¬ 
cific to serve in the direction of pupil activities has already been 
presented in Chapter Three. Even the most radical thinkers 
agree on the necessity for some such analysis, because general 
aims and purposes like the Social-Economic Goals for America 
are too remote to provide the immediate problems and intrinsic 
tensions which they hold to be essential in learning and 
adjustment. 

* This is one of the fundamental principles of organismic psychology, the basis on which 
much of the so-called Progressive Movement has developed. 

Conflict arises in determining what degree of specificity or
simplicity is desirable, and how this specificity affects the
functionality of these elements. Some psychologists have
insisted that such relatively specific objectives or elements as
facts, appreciations, skills, ideals, and attitudes are
significant in learning and adjustment only as they are attained as
character members or elements of larger patterns or situations.

The problem is complicated by the fact that wide differ¬ 
ences exist between learners in their capacity to perceive or 
comprehend wholes, or more general concepts, and that rela¬ 
tively specific elements for some learners may constitute 
relatively whole experiences for others. Again, specific ele¬ 
ments become character members of several overlapping and 
related wholes. 

Many others hold that this organismic concept is only a 
partial explanation of learning; that learning and adjustment 
are facilitated by synthesis as well as by analysis; and that 
certain specific elements or objectives in terms of facts, skills, 
habits, and the like are significant in themselves, and can and 
should be attained directly without reference to more general 
aims and purposes, or larger wholes, or experience patterns,
particularly after the organismic approach has apparently 
failed. They hold also that the immediacy of intrinsic moti¬ 
vation stressed by the new psychology is a relative matter, 
because the better integrated and adjusted the individual 
becomes and the greater his maturity, the more increasingly 
remote are the goals which will serve as the necessary tensions 
and problems. 

The most serious implications for the use of diagnostic and 
measurement procedures in improving learning and adjust¬ 
ment arise from certain assumptions which have been widely 
publicized and which may be summarized as follows:

1. That since the items of diagnostic tests deal with specific
objectives or elements of larger situations, those who use tests must
believe that whatever is true of these elements in isolation is true
of them in any complex functional whole; and

2. That success on such tests is not evidence of true learning or growth
in adequate adjustment patterns, but only constitutes proof that
the learning which preceded was mechanical, fragmentary, or
atomistic and that specific facts, skills, habits, and so forth were
attained in some serial order.

Neither of these assumptions is true. The first is largely 
meaningless. No one knows or cares much what might be 
true of an element in isolation; a basic assumption of the critics 
themselves is that all elements are perceived as parts of larger 
patterns. The nature or wholeness of the pattern depends 
largely on the perceptual ability of the learner. The types of 
objectives or elements which are represented by items of 
objective tests, therefore, do not exist in isolation, even though 
the larger patterns of which they are parts may vary in nature, 
complexity, and significance. 

The second assumption is equally erroneous. It is just as 
reasonable to assert that the specific tests which a physician 
makes do not constitute evidence of health or disease or the 
functional well-being of the individual and are of no value in 
guiding treatment, as to hold that a sampling of specific ele¬ 
ments is of no value in judging the status of learning and 
adjustment and in guiding learning activities which constitute 
or lead to adequate adjustment. 

No one knows exactly how specific elements function in any 
given learning or adjustment situation. The defender of a 
new theory is often forced to admit that he learned by proced¬ 
ures which violated every tenet of his new faith. Children 
and youths have always learned and adjusted in many dif¬ 
ferent ways, right and wrong, from the standpoint of any 
particular school of psychology. The integrative action of 
the self-unifying organism itself appears to be far more 
significant in learning and adjustment than the more or less 
accidental manner in which conflicting curriculum theories
and practices attempt to control the student’s approach to the 
world of reality. Our only adequate knowledge of many 
types of learning and adjustment must come from the specific 
reactions of the functioning organism, either through the 
medium of responses to test items or through carefully recorded 
observations of adjustive behavior. 

Constructing diagnostic tests. The ideal time to construct
diagnostic tests is before the learning activities related to a 
particular topic, unit, course, or subject are begun. This 
necessitates a careful determination of major aims and pur¬ 
poses and their more crucial elements or objectives. The 
particular elements selected as the basis of specific test items 
will depend largely upon the psychological beliefs and convic¬ 
tions of the teacher. Some teachers will stress fact and infor¬ 
mation items with no particular organization. Others will 
stress items which constitute problem situations. Still others 
will select and organize specific test items around major aims 
in a manner which reflects their ideas of how these items func¬ 
tion in learning and adjustment. 

An important consideration is the fact that a large variety
of test-item forms are available from which any teacher may
select those forms which aid in identifying the elements
considered most important.

The practice of making diagnostic tests ahead of time in
no way limits learning and adjustment. It assures preparation
by the teacher in selecting appropriate materials and proced¬ 
ures, and aids in a better direction of activities in attaining 
the goals selected. If additional elements appear important 
as learning activities proceed, they may simply be incorporated 
as new test items. Many valuable concomitant learnings
invariably occur which require no special identification. The
purpose of diagnostic testing is to furnish continuous specific 
information in order that learning activities may be most 
productive of desirable outcomes. 

General difficulties in test-making. It is well to realize
that a completely satisfactory diagnostic test has never been 
made. Test items which have previously served their purpose 
well suddenly appear inadequate. Ambiguities and obscurities 
are bound to occur, especially in the efforts of beginners, and 
incidental rather than crucial objectives receive attention. 
Instructors, sometimes, unconsciously force students to choose 
between their pet biases and conflicting objective evidence, or 
find themselves making tests on the objectives they happened 
to stress, rather than on those which should have been deter¬ 
mined beforehand for guiding educational activity. Sentences 
will be lifted bodily out of the contexts of materials read, and 
thus often lose a part of their significance, or completely mis¬ 
lead the student. Too many items will sample one type of 
objective to the exclusion of others or teachers will keep careful 
written records of elements of behavior patterns which later 
prove to have no diagnostic value. But progress is made by 
discovering what is valueless as well as what is significant. 
Above all, test-makers must not attempt to sample every pos¬ 
sible outcome even in comprehensive objective tests. Well- 
balanced samplings of important and crucial objectives only 
are necessary. The remainder of the chapter will be devoted 
to these construction problems. 

Constructing true-false tests. Major difficulties in
constructing the true-false test are the following:

1. Confusion in instructions in both oral and written forms. 

2. Balancing the numbers of true and false items in each test. 

3. Making statements which are partially true and partially false, 
or neither true nor false, or indeterminate. 

4. Using too long and involved statements. 

5. Using words or expressions which serve as clues to correct re¬ 
sponses. 

6. Using too many negatives, double and single. 

7. Utilizing technical or obscure language. 

8. Stating personal opinions or making broad, meaningless
generalizations.

Several different types of response symbols are often used for
true-false tests in the same class. The + and o, the + and -,
the yes-no, and the T and F are among them. Some of
these are illogical and others lead to cheating. The writer
recommends the insistence upon one form in both oral and
written tests, the T and F. In the written form, these symbols
should be placed at the right rather than the left of the line and
students instructed to underline or circle the correct answer.
The fact that the T's come first in each case in the
mimeographed or printed form has no effect on answers.

If presumably true and false items are equal in number each 
time, this fact will influence the student in answering, and the 
purpose of the test will be partially defeated. A mechanical 
order in the placement of T and F responses should be avoided. 
Table 8 shows the changes in answers in a class of forty-two 
students. No information was given until the test was com¬ 
pleted. Students were allowed to change as many items as 
they desired in a sixty-item test, after they were told that the 
true and false items were equal in number. 

Table 8. Number of Students Changing
Answers in a True-False Test


Number of Items     Number of Students
Changed             Changing Items

9                    1
8                    2
7                    1
6                    0
5                   10
4                    8
3                    1
2                    5
1                    4
0                   10
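
The counts in Table 8 can be summarized with a few lines of arithmetic. The Python sketch below is offered only as an illustration; the dictionary simply transcribes the table, and the function and key names are hypothetical.

```python
# Counts transcribed from Table 8: number of items changed -> number of students.
CHANGES = {9: 1, 8: 2, 7: 1, 6: 0, 5: 10, 4: 8, 3: 1, 2: 5, 1: 4, 0: 10}


def summarize(changes: dict[int, int]) -> dict[str, float]:
    """Summarize how many students changed answers and how many items they changed."""
    students = sum(changes.values())
    # Students who changed at least one answer after being told of the balance.
    changed_any = sum(count for items, count in changes.items() if items > 0)
    # Total number of individual answers changed across the class.
    total_changes = sum(items * count for items, count in changes.items())
    return {
        "students": students,
        "changed_any": changed_any,
        "total_changes": total_changes,
        "mean_changes_per_student": total_changes / students,
    }


if __name__ == "__main__":
    for name, value in summarize(CHANGES).items():
        print(name, value)
```

On these figures the student counts add up to the forty-two students of the class, and thirty-two of them changed at least one answer once they knew that true and false items were equal in number.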


It is well to instruct students who are taking printed or
mimeographed tests to indicate items which are ambiguous,
particularly those which appear to be neither absolutely true or
false, nor mostly true or false. Many of these comments will
simply reveal lack of achievement on the part of the student;
others will serve as the basis of clarification or elimination of
the item.

Students find many answer clues in true-false statements.
Weidemann¹ and Brinkmeier² are among those who have
investigated this problem. Brinkmeier found that after
statements exceeded approximately fifteen words they were
increasingly likely to be true. Table 9 reveals typical clues in an
analysis of over ten thousand true-false statements.

Table 9. Clues in Answering True-False Statements
(Adapted from Brinkmeier)


Total Items Containing                                Number of
                                                      Times True

1. Totally, entirely, etc., or only and alone         1 in 10
2. All, or no, none, etc.                             1 in 5
3. Always or never                                    1 in 4
4. Causes or reasons                                  1 in 3
5. Should                                             2 in 3
6. Almost all, as a rule, generally, etc.             3 in 4
7. Many or often, frequently, some, seldom,
   few, any, etc.                                     4 in 5
8. Citations to definite dates, words,
   periods, etc.                                      5 in 6

Each item of a test should be examined for negatives which 
introduce unnecessary difficulties and for technical or obscure 
language, or personal opinions of the test-maker. Such items 
lead to confusion, waste the time of the class, and defeat the 
purpose of the test. Where only a few items are given orally 
as the point of departure in a class discussion, these objections 
are not so serious. In such cases students should be given the
opportunity to point out the objectionable features of
unsatisfactory items, and the objective involved should be clarified.

¹ Weidemann, C. C. How to Construct a True-False Examination. Teachers College
Contributions to Education, no. 225. New York, Columbia University. 1926.

² Brinkmeier, I. H. A Study of the Technique of Constructing Objective Examinations.
University of California, Ph.D. Thesis, 1929.

Constructing recall tests. The recall or completion test is
not a mere test of memory, but may require complex reasoning.
It is extremely valuable as a point of departure in teaching.
To the extent that items are drafted so that personal opinion
is eliminated in scoring, it possesses high validity. For short
diagnostic tests which are not scored, this factor is not
important. Typical errors in making recall tests include the
following:

1. Testing incidental rather than crucial objectives.

2. Using unmodified statements from textbooks and articles, which
are dependent on the context for meaning.

3. Making too many blanks in statements.

4. Failing to substitute blanks for the key ideas.

5. Including items that call for too long answers.

6. Using dots or varying the lengths of blanks to suggest the number
of letters in the correct answer.

In constructing the recall as well as other tests, the objectives
of the assignment, topic, or unit to be tested should first be
reviewed. Depending upon the field and the progress of the
class, the tester should make clear-cut, significant statements
covering each important objective. Key words should then
be crossed out and blanks substituted. Blanks should be the
same length so that they do not serve as answer clues. As a
rule, each statement should contain only one crucial idea and
one blank.
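
The procedure just described (write the statement, strike out the key word, substitute a blank of uniform length) can be sketched in Python. This is an illustration under stated assumptions: the function name and the blank length of twelve characters are arbitrary choices, and the sketch enforces the rule of one blank per statement by rejecting statements in which the key word does not occur exactly once.

```python
# A blank of fixed length, so that its size gives no clue to the answer.
BLANK = "_" * 12


def make_recall_item(statement: str, key: str) -> str:
    """Replace the single occurrence of `key` in `statement` with a uniform blank."""
    if statement.count(key) != 1:
        raise ValueError("the key word must occur exactly once in the statement")
    return statement.replace(key, BLANK)


if __name__ == "__main__":
    print(make_recall_item("The author of Ivanhoe is Scott.", "Scott"))
    print(make_recall_item("The author of Treasure Island is Stevenson.", "Stevenson"))
```

Both printed items carry blanks of the same length, although the answers differ in length.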

After constructing the test it is well to write all the answers 
that would be accepted for any item. Ambiguous items or 
those which admit of too many or a variety of answers may 
then be rephrased. 

Constructing the multiple-choice test. This test should be 
given in written form. While the multiple-choice test is val¬ 
uable in comprehensive standardized tests which will be used 
several times, the time necessary to construct good items 
sometimes makes its value questionable for brief tests which 
will be used but once. Frequently only two or three choices
can be made to appear plausible, but this is satisfactory for 
informal diagnosis. Where the best-answer form is used, all 
answers being correct, all items should be tried out before being 
used, since it is often difficult to judge the relative merits of 
similar responses. For the most part, however, the confusion 
items or distracters do not distract, but are too obviously 
wrong. Frequently in the struggle to make all items plausible 
the item intended for the answer is not correct. 

The following steps are suggested in constructing multiple- 
choice items: 

1. Review the range of objectives to be tested in connection with 
the assignment, topic, or unit of work.

2. Make significant correct statements covering the important or
crucial objectives.

3. Consider the possible misconceptions which students might obtain 
or errors which they might commit in connection with each objective
to be tested, and write from one to four additional statements
containing these possible errors. 

4. Rewrite the item containing the major statement and the choices. 

5. Vary the position of the correct choices. 

6. Before using the test, search each item for answer clues in the
form of tense, number, parts of speech, and so forth. Be sure
that the correct answers are correct, and that other choices are
plausible.
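
Steps 4 and 5 can be sketched in a few lines of Python. This is only an illustration, not a procedure given in the text: the helper name build_item, the sample item, and the lettering scheme are assumptions made for the example, and the sketch assumes the correct statement differs from every distractor.

```python
import random

def build_item(stem, correct, distractors, rng):
    """Assemble one multiple-choice item with the correct choice in a random position.

    Hypothetical helper: assumes the correct statement differs from every distractor.
    """
    choices = [correct] + list(distractors)
    rng.shuffle(choices)  # step 5: vary the position of the correct choice
    letters = "abcde"[:len(choices)]  # one correct statement plus one to four others
    key = letters[choices.index(correct)]  # letter to record on the scoring key
    return stem, list(zip(letters, choices)), key

rng = random.Random(7)  # fixed seed so the same arrangement can be reproduced
stem, lettered, key = build_item(
    "Blanks in a recall test should be",
    "of the same length",
    ["as long as the answer", "marked with dots"],
    rng,
)
print(stem)
for letter, text in lettered:
    print(f"  ({letter}) {text}")
print("key:", key)
```

Shuffling by machine avoids the unconscious habit of placing the correct choice in a favored position; the search for answer clues in step 6 still has to be done by reading each item.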

This test form may be used in a great variety of situations, 
and include a variety of types, among them the following: 

1. Explanation or proof type: A statement asserts that something is 
true or desirable; several possible reasons are offered, only one 
of which may be best. 

2. Inference type: A statement presents a given set of conditions. 
A number of possible inferences which may be drawn follow; only 
one may be correct or may be best. 

3. Problem type: A problem is stated, and several plausible statements
of the solution are given.

4. Translation type: A statement in one language is followed by
several possible translations in another.

5. Spelling type: Words are spelled in several different ways; students
select the correct spelling. 

6. Vocabulary type: Words are presented followed by several possible 
meanings; students select the correct meaning. 

The form of the test is also important. Choices should not
be strung along indiscriminately in the body of the item. Frequently
it is difficult to tell where the last choice ends and the
remainder of the statement begins. Choices should be numbered
or lettered for identification. For comprehensive tests,
blanks to the right of each item, on which the number or letter
of the choice may be indicated, aid greatly in scoring. If
students are asked to check or underline choices as well, additional
data are provided for diagnostic purposes.

Constructing matching tests. This form is less useful in 
day-to-day diagnostic testing than the true-false or recall form, 
but is more useful for pre-tests, or diagnostic tests of large 
topics or units. In reality a special form of the multiple- 
choice test, it is a time-saving form in which many objectives 
of a similar nature are to be sampled and significant obstacles 
to learning identified. Major difficulties in constructing 
matching tests are as follows: making tests too short or too 
long; making short tests with the same number of answers as 
test elements; mixing unlike data which serve as answer clues, 
such as singulars and plurals, and elements preceded by a, an, 
or the. The following suggestions will be helpful:

1. Tests should contain at least ten items where possible, but rarely 
more than fifteen. 

2. Before deciding to use this form, be sure that there are a sufficient 
number of similar or comparable objectives to be diagnosed or 
tested. 

3. As a rule, unless the test contains twelve or fifteen items, include
a few extra items in possible choices which illustrate errors students
often make, or common misconceptions.

4. Avoid any mechanical plan in relating items in the two parts of
the test.

5. Avoid differences in data, such as number, dates, or person, which
serve as answer clues. 

The matching form is useful in such fields as English, where 
authors and their contributions may be matched; or in social 
studies, where concepts or principles may be matched with 
their originators; or in the field of science, where chemicals 
may be matched with their reactions, and scientists may be 
matched with their discoveries. Also, products may be 
matched with areas in which produced, and important events 
matched with dates or places. The matching form may also 
be used in tabular form as well as in the familiar two-column 
form. In the former case, a column of test items down the 
left of a page may be related to a number of items across the 
top by checking in the appropriate column or columns. This 
form will be most useful if the data are comparable or similar. 

Constructing other test forms. True-false, recall, multiple- 
choice, and matching tests represent the most-used forms in 
diagnosis as well as in appraisal. They illustrate the essential 
principles of all tests. In fact, some widely used commercially 
distributed tests use no other forms, and some are developed 
almost entirely on one of the first three. 

However, these principles may be utilized in many other 
forms. Outline maps in geography and history, schematic 
views of trees and flowers in botany, drawings of the body in 
physiology, may now be drawn and mimeographed so easily 
that many schools use them in identification and map-location 
tests. They are purely objective, easy to give and score, and 
make it possible to test many significant objectives with a minimum
expenditure of time. In the same way, mimeographed
tests of English usage may be utilized to test the ability of 
pupils to identify and correct errors in English usage. If the 
latter form is used as a more or less comprehensive test for 
appraisal purposes, it is advisable to have only one error in 
each line of copy, have students underline errors, and write the
correct forms on blanks at the right. This method aids greatly
in scoring. 

Diagnosing inadequate oral learning and adjustment. Besides
the numerous paper-and-pencil tests, the teacher is concerned
with the diagnosis of certain oral aspects of learning.
These include certain outcomes in speech and music and are
essentially individual in nature. In constructing tests of oral
English, care should be taken to include the essential elements.
The copy which students read should contain all elements
which are likely to cause trouble. These include certain
beginning and ending sounds; difficulties of enunciation of
words of several syllables, especially similar words; the proper
values for vowels; pronunciation; good voice placement; and
modulation. These and other elements should be carefully
tested by the use of selections from regular work, as well as by
the use of samples illustrating special difficulties, which teachers
may construct. Such difficulties may be put in the form
of a check list across the top of the page with names of students 
down the left side. As each student's difficulties are checked 
the teacher has an automatic diagnosis of the class as a whole. 
It may also serve as a record of progress, records of individual 
pupils being cleared as difficulties are overcome. 
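
The check list just described amounts to a small table of pupils against difficulties. A minimal sketch follows, assuming the list is kept as a Python mapping; the pupil names, the difficulty labels, and the helper names class_summary and clear are hypothetical and serve only to illustrate the idea.

```python
from collections import Counter

# Hypothetical check list: each pupil's name mapped to the difficulties checked for that pupil.
checklist = {
    "Pupil A": {"ending sounds", "vowel values"},
    "Pupil B": {"ending sounds"},
    "Pupil C": set(),
}

def class_summary(checklist):
    """Count how many pupils show each difficulty (the column totals of the check list)."""
    return Counter(d for difficulties in checklist.values() for d in difficulties)

def clear(checklist, pupil, difficulty):
    """Remove a check once the pupil has overcome the difficulty."""
    checklist[pupil].discard(difficulty)

summary = class_summary(checklist)
for difficulty, count in summary.most_common():
    print(f"{difficulty}: {count} pupil(s)")
```

The column totals give the diagnosis of the class as a whole, and clearing an entry as a difficulty is overcome keeps the same sheet usable as a record of progress.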

In the same manner, diagnosis of certain oral aspects of 
music should be specific and systematic. Such elements as 
pronunciation, enunciation, breathing, syllables (where used), 
note values, and tone quality should be singled out for definite 
examination in the case of all pupils not making normal 
progress. A part or whole song may be sung, while the teacher 
checks carefully the exact nature of the deficiencies. This 
record then serves as the point of departure in a rational 
remedial program. 

Product diagnosis. Certain other outcomes do not lend 
themselves to the pencil-and-paper type of diagnosis even 
though they may be pencil-and-paper products. For example,
it is difficult to tell from inspection whether or not a given
sample of writing is satisfactory for a specific grade, and exactly
what should be done if it is not. While the ideal is
always a clear, legible hand, children, as well as adults, have
certain limitations which must be recognized. Consequently,
workers have collected samples of writing at succeeding periods
of development and have selected typical samples to use as
standards of expectancy. When students do not achieve such
standards, it is necessary to determine, if possible, what specific
difficulties are responsible for this failure. Color, alignment,
beginning and ending strokes, and other elements must
be judged. This may be done with or without diagnostic samples
illustrating the various difficulties. While such materials
are available commercially, any teacher can collect samples of
these difficulties for her particular grade or age group, and
organize them into charts with remedial suggestions, and thus
make it possible for pupils to aid in the diagnosis of their own
difficulties.

In the same manner, outcomes in sewing, shopwork, drawing,
and other fields may be diagnosed. The first essential
step is to obtain a sufficient number of samples of the outcomes
to determine what is reasonable and attainable; the second
step, when samples are available, is to identify as far as possible
the elements which serve as obstacles to normal progress.
Students may aid by self-diagnosis. 

Personality and conduct diagnosis. Scales for rating and 
diagnosing personality defects and inadequate adjustment 
patterns must follow the principles just explained, but it is in 
this field that they have been most flagrantly violated. One 
teacher punishes what another ignores; another utilizes the 
same overt misbehavior in the attempt to diagnose and remove 
the causes. One major reason for this variability is a lack of 
knowledge of what constitutes a reasonable and attainable 
expectancy for children and youth at different stages of maturation.
Much research has been done in the field of intellectual
and physical growth. Much less, relatively, has been done in 
the field of personality and adjustment. 

Another difficulty is the tendency of teachers to think of 
personality and adjustment as something wholly separated 
from facts, information, understandings, and skills. The 
learner is a self-unifying organism, and the specific elements 
which are sampled by diagnostic tests and scales are simply 
different aspects of this functioning organism. Data on these 
specific elements should be brought together in the attempt to 
obtain a balanced picture of the individual. 

Adjustment standards. The first step in diagnosis is to 
obtain a wide representative sampling of the trait or type of 
adjustment to be diagnosed in order to establish a reasonable 
and attainable standard. Traits or adjustment patterns must 
be carefully defined because so much vagueness exists in this 
field. In the case of punctuality, for example, which most 
teachers think of vaguely as being-on-time, the various different
acts in which children and youth engage and in which they
may be tardy must be considered. These include such factors
as handing in work on time, getting to school on time, and going
from class to class on time. To discover, for example,
reasonable and attainable standards for getting to school on 
time a wide sampling of attendance must be obtained from 
which may be determined how often, on the average, children 
in general are tardy during a month, semester, or year. A 
similar determination should be made for each of the other 
aspects of punctuality. Eventually, it becomes clear that 
punctuality cannot exist except in connection with certain 
specific acts, has no significance or meaning apart from them, 
and may vary under changing circumstances. The establishment
of reasonable and attainable standards of punctuality for
various specific acts constitutes the basis for evaluating the 
adequacy of this type of conduct or adjustment. 
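
As a small worked illustration of setting such a standard, the sketch below averages tardiness over a sample of pupils. The figures are invented for the example, and taking the simple mean as the standard is only one reading of "how often, on the average"; the helper names are hypothetical.

```python
from statistics import mean

def tardiness_standard(tardies_per_pupil):
    """Average number of times a pupil was tardy in the period sampled."""
    return mean(tardies_per_pupil)

def exceeds_standard(tardies, standard):
    """True when a pupil's count is above the group average."""
    return tardies > standard

# Hypothetical sample: times tardy in one month for eight pupils.
sample = [0, 1, 0, 2, 0, 0, 4, 1]
standard = tardiness_standard(sample)
print(f"average times tardy per month: {standard}")
print("pupil with 4 tardies exceeds it:", exceeds_standard(4, standard))
```

A separate standard of this kind would be worked out for each of the other acts, such as handing in work on time, since a pupil may meet one and not another.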

Using personality data. Originally teachers were satisfied
merely to measure and mark pupils. Now they attempt to
discover and remedy the causes of unsatisfactory achievement.
For many years, they were satisfied to mark conduct as if they
had no other responsibility. Recently, they have begun to
accept responsibility for diagnosing the causes of unsatisfactory
personality development and adjustment as the basis for guiding
students toward better adjustment.

Where habit or conduct falls below reasonable expectation 
in any particular, it is diagnosed in terms of the causes, and as 
far as possible these causes are removed. In the same manner, 
such well-known generalities as co-operation, honesty, courtesy,
self-direction, orderliness, and participation, among
others which appear on report cards, must be analyzed and
studied in terms of their specific manifestations in overt acts.
Their appearance on report cards calls attention to the desirability
of learning to appraise and diagnose their adequacy.
At present most teachers can only guess and estimate these
qualities. Unfortunately, skills such as achievement in reading
or arithmetic which can be measured with much greater
accuracy have been eliminated from report cards. 

Recognizing the paramount importance of outcomes in terms 
of growth and development of personality, and the fact that 
these objectives have long been largely neglected, certain workers
have attempted to shorten the program necessary to analyze
and evaluate these elements. They have arbitrarily defined
some of them, guessing at their different manifestations; or
they have defined arbitrarily in terms of varying degrees of
satisfactory manifestation. No individual or group can “by
taking thought” determine to what extent pupils should be
self-directing, or orderly, or co-operative, any more than they
can consider a series of intellectual or physical traits, and then
say what pupils of varying ages shall do. The result of this
technique is to confuse the issue, to “prove” to many that
nothing can be done, and to delay any real progress.

In summary, progress in appraising and diagnosing personality
and conduct may be made in the following manner:

1. By breaking down through analysis and research each general 
concept or personality trait or observed adjustment pattern which 
is considered important, into elements whose presence or absence 
can be detected in a given reaction or act of the individual. 

2. By a study of these elements as they are exhibited by large groups 
of normal children, to determine what is normal and reasonable 
expectation to use as a standard. 

3. By a study of deviations from normal manifestation of these 
elements, to determine why they occur and to discover how to 
remove their causes. 

Research in personality and adjustment. At present, literally
hundreds of adjectives are used in the attempt to describe
personality, conduct, and adjustment, and identical situations
are appraised with widely varying results. Research already
completed suggests the practical possibility of describing
adjustment in terms of a few basic, definitely determinable
factors; even a superficial analysis of some so-called traits
raises serious doubts about their reality. For example, the
boy who comes to school tardy may never fail to get assignments
in on time, and once he is at school may never fail to go
from class to class on time. The fact that no one at present
knows just how specific elements function in adjustment does
not make a valid identification of these elements less desirable;
inability to identify the elements of particular acts of adjustment
is at present a cause of great confusion. We must, therefore,
continue to set up the best theoretical learning situations
of which we are capable in order to stimulate desirable growth
and adjustment. When we have learned to describe adequately
the complexities of important adjustment situations
and to diagnose the causes of the presence or absence of crucial
specific elements, we can not only provide better remedial
programs, but can also improve the whole program in the interests
of personality growth and development.

Time factor and diagnostic procedures. Short diagnostic
tests of five to ten items can be given successfully in oral form, 
and the results utilized immediately as the point of departure 
in teaching and learning. Pre-tests of large topics or units and 
diagnostic tests covering a similarly large number of objectives 
should be mimeographed, if possible. The amount of time 
necessary to test for a large number of significant objectives is 
remarkably small. Early investigators reported ten to fifteen 
true-false items, eight to ten multiple-choice, and six to eight 
recall items per minute for adults. Table 10 reveals actual 
experiences in secondary schools. 


Table 10. Number of Test Items per Minute
in Secondary-School Objective Tests

Type of Test                      Lowest 10%    Average    Upper 10%

True-false                           3.6          5.4         7.2
Multiple-choice (3-6 choices)        2.5          3.7         5
Recall                               2            3           4

The actual number of items of each type which can be done 
in a given time is not, of course, a matter of form only, but also 
of other factors, including difficulty. On the basis of these 
data, elementary-school pupils might be expected to do somewhat
fewer. Practical standards for general guidance are suggested
in Table 11. While teachers will vary the time in accordance
with conditions, research indicates that tests may be
taken in the time indicated in Table 11.

Table 11. Number of Minutes Required for
Fifty-Item Objective Tests

                                          Schools
Type of Item                   Elementary   Secondary   College

True-false                         12            8          7
Multiple-choice (3 response)       18           12         10
Recall                             25           16         15

Practically, time is not an important factor in short five-to-ten-item
tests. Students prefer not to work up to these possible
time limits in actual practice. However, even on the
elementary-school time basis, secondary and college teachers
can give a very comprehensive diagnostic test covering an
important topic or unit in fifteen or twenty minutes. While
high-school and college students might be expected to do many
more items per unit of time if the items are of the same difficulty
or type, the extent to which problem situations and reasoning
and thinking are introduced will operate to reduce the
number which can be answered.
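
For a teacher planning a mixed test, the per-minute rates can be turned into a rough estimate of working time. The sketch below uses the average secondary-school rates as read from Table 10; the function name and the sample test plan are assumptions made for illustration, and the estimate ignores the effect of item difficulty noted above.

```python
# Average items per minute in secondary-school tests, as read from Table 10.
ITEMS_PER_MINUTE = {"true-false": 5.4, "multiple-choice": 3.7, "recall": 3.0}

def estimated_minutes(item_counts, rates=ITEMS_PER_MINUTE):
    """Estimate working time by dividing each item count by its per-minute rate."""
    return sum(count / rates[kind] for kind, count in item_counts.items())

# Hypothetical mixed test: 20 true-false, 15 multiple-choice, and 10 recall items.
plan = {"true-false": 20, "multiple-choice": 15, "recall": 10}
print(f"about {estimated_minutes(plan):.1f} minutes")
```

For this hypothetical plan the estimate comes to roughly eleven minutes, which falls within the fifteen or twenty minutes mentioned for a comprehensive diagnostic test.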

Test instructions should always be explicit; it is very important
that students be told whether or not to attempt every
item, and exactly how such attempts should be indicated. If
tests are to be used for marking purposes, care should be observed
in any assistance which students give in scoring. Papers
may be collected and redistributed, or passed forward,
backward, right, or left alternately. Scores on short tests may
be accumulated and used for marking at the end of a quarter or
semester. Although a single test of ten items possesses low
reliability, ten such tests added together would possess relatively
high reliability.
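
The gain from pooling short tests can be estimated with the Spearman-Brown formula, a standard result which the text does not state explicitly; it assumes that the added tests are comparable to the first. The reliability of 0.40 for a single ten-item test is a hypothetical figure chosen for illustration.

```python
def spearman_brown(reliability, k):
    """Projected reliability of a test lengthened k times with comparable items."""
    return k * reliability / (1 + (k - 1) * reliability)

single = 0.40  # hypothetical reliability of one ten-item test
pooled = spearman_brown(single, 10)  # ten such tests added together
print(f"one test: {single:.2f}  ten tests: {pooled:.2f}")
```

Under these assumptions the pooled score has a projected reliability of about 0.87, which illustrates why accumulated scores on short tests may safely be used for marking.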

The reliabilities of different test forms. The comparative 
reliabilities of various forms of objective tests are not a factor 
when very short diagnostic tests related to daily work are used. 
In longer comprehensive tests used for appraisal purposes, particularly
for final examinations, this factor is important.
Table 12 presents Ruch’s determination.*

* Ruch, G. M. The Objective or New-Type Examination. Chicago, Scott, Foresman and
Company, 1929.

Table 12. Comparative Time and Reliability Data for
Five Types of Objective Tests

(a)           (b)                  (c)                    (d)             (e)
Type          Time in Minutes      Items that Can be      Reliability     Reliability per
              to Complete 100      Given in 18.7          of 100 Items    18.7 Minutes of
              Test Items           Minutes (Recall                        Working Time
                                   as Base)

Recall        18.7                                        .90             .90
5-response    16.0                                        .89             .90
3-response    13.5                                        .75             .81
2-response    11.4                                        .85             .90
True-false    10.2                                        .71             .82

From Table 12 it is evident that any well-prepared test
requiring twenty minutes would possess satisfactory reliability.
Tests requiring thirty to sixty minutes should possess
high reliability. This conclusion is, however, dependent upon
the assumption that approximately the same number of items
are covered as indicated in column c of Table 12 per 18.7 minutes
of testing time. The validity of a test is dependent upon
the selection of test items, which is, in turn, dependent upon a
knowledge of important objectives. However, to the extent
that educational activities are vague, undirected, remotely
related, or unrelated to these important objectives, the test
will still lack validity. It is impossible to direct learning, to
diagnose learning difficulties, or to appraise outcomes adequately,
without an adequate knowledge of specific desirable
objectives to begin with.
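
Column c of Table 12 expresses the speed of each test form with recall as the base. Assuming that pupils work at a constant rate, so that the number of items is simply proportional to the time allowed (an assumption of this sketch, not a statement taken from Ruch), the item counts can be worked out from the times in column b. The function name items_in is hypothetical.

```python
# Minutes needed to complete 100 items of each type (column b of Table 12).
MINUTES_PER_100 = {
    "recall": 18.7,
    "5-response": 16.0,
    "3-response": 13.5,
    "2-response": 11.4,
    "true-false": 10.2,
}

def items_in(minutes, minutes_per_100):
    """Items that fit in the given time, assuming a constant working rate."""
    return 100 * minutes / minutes_per_100

for kind, per_100 in MINUTES_PER_100.items():
    count = items_in(18.7, per_100)
    print(f"{kind.ljust(11)} {count:6.0f} items in 18.7 minutes")
```

The same function gives the number of items to plan for a twenty-minute or thirty-minute test of any one form, which is the condition on which the reliability estimates in the paragraph above depend.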

Standardized tests and diagnosis. Because an adequate
discussion of standardized tests involves several technical
problems, the presentation will be delayed until later. However,
standardized tests usually constitute the first step in diagnosis.
They reveal to what extent a particular group has
achieved, learned, or adjusted, as compared with other similar
groups. An examination of test results reveals what particular
aspects of learning or adjustment need immediate attention,
and suggests the particular diagnostic procedures which will
be helpful in identifying the causes of failure and success.

QUESTIONS AND EXERCISES 

1. What is your concept of the relation between the objectives of
educational activities and the items of diagnostic tests and scales?

2. Indicate the relation between different types of objectives and
diagnostic procedures. Why do teachers so frequently test for
only one or two types?

3. What is meant by the functionality of objective elements of learning
situations?

4. Explain the assumptions presented regarding the relationship between
specific test items and atomistic learning, and give your
point of view.

5. What general difficulties are met in all diagnostic testing?

6. What dangers should be avoided in constructing true-false tests?
In what way would you modify or add to the suggestions given?

7. Discuss the difficulties and procedures for constructing recall tests;
multiple-choice tests; matching tests.

8. Discuss the problems involved in diagnosing inadequate oral learning
and adjustment.

9. What principles should be followed in constructing diagnostic
product scales?

10. Define personality adjustment. Discuss the relationship between
specific elements such as facts, understandings, and skills and adjustment.
To what extent do you accept the suggestions for
making progress in diagnosing and appraising inadequate learning
and adjustment? How would you modify them?

11. Discuss the time and relative reliabilities factors of diagnostic
procedures.

12. Make a ten-minute diagnostic test on this chapter, involving the
use of true-false, recall, and multiple-choice items. Make a tentative
scale for diagnosing and appraising some character trait or adjustment
pattern.


CHAPTER FIVE


DIAGNOSIS AND APPRAISAL IN 
PRIMARY LEARNING 


Evidence from many sources indicates a need for better diagnostic
procedures in the primary grades, particularly during
the first months of the first grade. It has been found that as
many as one-fifth of all first-graders still fail to learn to read
adequately in some school systems. Because the school year
is divided into quarters or semesters, these pupils are carried
along from twelve to eighteen weeks in situations which baffle
and defeat them, and many develop unfortunate behavior patterns
which seriously detract from the effectiveness of subsequent
efforts to aid them. Either they are not ready to attempt
reading, or their efforts are not adequately guided; the
solution of the problem depends largely upon the use of diagnostic
procedures which yield more accurate and continuous
information regarding their abilities, problems, and difficulties.

Special problems of diagnosis. This level of development 
is characterized by special problems of diagnosis. Most children
cannot read or write at this age, hence the possibility of
paper-and-pencil diagnosis is reduced to the minimum. Pupils
differ greatly in the extent to which they have become independent
of their parents. Many are exceedingly shy, and
great tact is at first necessary to induce them to participate in
group activities. Others have developed opposite types of
behavior and grasp every opportunity to exhibit their accomplishments.
Although most beginners can communicate
freely with each other, they have no common problems and
ideas, hence one of the major objectives of the pre-primer
period is to provide such a basis before actual reading begins.

In addition, some other presumably desirable primary objectives
are still very vague and general. It is difficult, therefore,
to diagnose all needs and difficulties. However, if, before
she begins work, the teacher compiles as specifically as possible
the objectives she expects pupils to attain, it will aid greatly
in guiding growth and adjustment. No matter how often she
modifies this list in the light of later knowledge it will have
served its purpose. Without convictions or clear-cut ideas of
important desirable outcomes, learning activities cannot be
adequately directed and many pupils will continue to develop
undesirable attitudes and inadequate behavior habits.

The needs of each child should be determined in the light of 
his own capacities, characteristics, and previous development; 
his interests should be utilized in directing his growth. Diag¬ 
nosis should be individual, thorough, and systematic. 

Physical diagnosis. No teacher should attempt to supplant 
the regular physician or the specialist in detecting physical 
handicaps that interfere with learning and the development of 
normal healthy personality. However, unless teachers accept 
a much larger responsibility for health, many children will continue
to suffer from remediable defects, because they receive
little private medical aid. Even in schools where physicians
are employed and physical examinations are conducted regularly,
the results are too often filed where teachers can use
them only with inconvenience. These important data should 
be entered on cumulative record cards in possession of the 
teacher. 

Curvature of the spine and visual difficulties are the most
common physical defects, over twenty per cent of all school
children suffering from them. Muscular weaknesses and inactivity
account for most curvature. If eye defects are not corrected,
children develop squints because of eye strain, the
nervous system is affected, and certain personality disorders
are encouraged. More than twenty per cent of all children
have defective or [...] teeth. Children are more susceptible
to tuberculosis during puberty than at other [...]. [...] from homes of
excellent economic standing, suffer from malnutrition. Most
[...] exists or develops during early school years.

Without special training or overemphasis on physical difficulty,
it is possible and desirable for teachers to become increasingly
skillful in detecting the symptoms of many physical
difficulties of children and reporting them to the proper authorities.
The following outline¹ is very useful as a diagnostic
instrument: it points out a wide range of evidences of difficulty,
most of which are easily detected and may be utilized to
advantage by teachers.

Group I. Indications of Health Disorders in Children Which [...]
Trained to Notice and to Report to Constituted Authorities

Signs:

Nausea or vomiting
Chill, convulsions (fits)
Dizziness, faintness, or unusual pallor (alarming paleness of the face)
Eruption (rash) of any kind
Fever
Running nose
Red or running eyes
Sore or inflamed throat
Acutely swollen glands
New cough
Any distinct or disturbing change from usual appearance or conduct of child



The foregoing signs should be used by teachers as a basis for excluding
pupils from school for the day or until the signs have disappeared
or until the proper health officer has authorized the return
of the pupil to school.

¹ Prepared by Thomas D. Wood, M.D., Columbia University, and used with his permission.

Group II. Signs of Abnormality Pointing to More Chronic
Disorders Which Should be Remedied Early

Signs:

Disorders of nose, throat, ear, and organs of respiration
    Mouth-breathing
    Loud breathing
    Nasal voice
    Catarrh
    Frequent colds
    Offensive breath
    Chronic cough
    Deafness
    Twitching of lips

Eye disorders and defects
    Headache
    Crossed eye
    Squinting
    Holding book too near face

Teeth defects
    Decayed teeth
    Crooked teeth
    Discoloration of teeth
    Offensive breath

Nervous disorders
    Inability to hold objects well
    Spasmodic movements
    Twitching of eye, face, or any part of body
    Nail-biting
    Perverted tastes
    Sex disturbances

Defects of feet
    Pain in feet
    Toeing out markedly
    Flatfoot gait
    Swelling, puffiness of feet
    Excessive perspiration of feet

Incorrect posture
    Unequal height of shoulders
    Flat chest
    Round neck and shoulders
    Stooping

Group III. Indications of Disturbance Which are Important
in Connection with Other Signs of Physical Disorder

Signs:

Nutritional and general disorders
    Deficient weight
    Pallor
    Lassitude
    Perverted tastes (food)
    Slow mentality
    Peculiar or faulty postures
    Underdevelopment
    Excessive fat
    Low endurance
    Disinclination to play
    Fatigue

Defects of feet and legs, and defective movements
    Pigeon-toed gait
    Shuffling, inelastic walk
    Exaggerated knee action in walking
    Shifting from foot to foot
    Standing on outer edge of feet
    Standing on inner side of feet, heels turned out
    Locking knee
    Leaning against wall or desk
    Shoes run over at either side
    Wearing out soles asymmetrically
    Twitching of foot muscles

Psychiatric diagnosis. The terminology of psychiatry is 
relatively new to teachers, yet many of its problems are very 
familiar. Teachers have long been definitely aware of certain 
peculiarities of children's behavior, but have not understood 
their significance. By learning what is normal and when deviations
are regarded as abnormal, teachers will be able to detect
many incipient signs of nervous and emotional difficulty.
Parents are particularly helpless in this capacity, because they
always “understand” their children even when they are definitely
psychopathic. When they do suspect a difficulty, they
are inclined to “cover up” because they consider it disgraceful
to have a suspicion of insanity in the family.

Everyday living produces a continuous series of problems
and the necessity for choosing among several methods of solution.
Some choices are normal and desirable; others are considered
abnormal. The child who shows early tendencies to
escape his problems through day dreaming, who develops
obsessions or compulsions, who is unduly negative in his attitudes
and conduct, who is extremely selfish, or who fails to progress
from infantile behavior, is giving evidence of abnormal tendencies,
and should receive aid in the proper solution of his
difficulties.

Personality and conduct diagnosis. This type of diagnosis
is closely related to psychiatric diagnosis. In this type,
difficulties and deviations may be due entirely to the
environment of the child. Psychiatric diagnosis, by contrast,
often involves actual nervous and mental difficulties.

Personality and conduct patterns are often set so early that 
it is advisable to give close attention to the social adjustment 
of children from the very first. However, teachers find this 
difficult because of the lack of normative material; that is, 
information about what is normal and desirable at succeeding 
ages. Many children now attend nursery schools. On this 
level, Joel has produced a carefully developed rating scale to 
detect relative emotional and social behavior maturity. By 
observing the child in various situations, the teacher is apt to 
detect extreme deviations in personality and conduct on this 
level. By means of a scoring system of weights, a final 
score gives additional information for interpreting the rating 
obtained. Table 13 presents four items from the Joel 
scale. 


¹ Joël, Walther. “Behavior Maturity of Children of Nursery School Age”; in Child
Development, vol. 7, no. 3 (September, 1936).



104 DIAGNOSIS AND THE DIRECTION OF LEARNING 

Table 13. Illustrative Items from Joel's Behavior Maturity Rating Scale


Scoring 

Weight 


8. How does he face his difficulties? (for example, when crossed)

   1. Calmly, peacefully settles difficulty without appeal to
      adult (or older brother, etc.).
   2. Makes a fuss, but settles difficulties without appeal to
      adult.
   3. Attempts to settle difficulty, but also appeals to adult.
   4. Does not attempt to settle difficulty; appeals to adult
      for help.
   5. Tantrum.

9. For how long can he be absorbed in an activity?

   1. Returns to an unfinished activity of the previous day,
      continues and develops it.
   2. Pursues an activity until it is finished and is not
      disturbed by other attractions, but does not return to it
      from one day to the next.
   3. Stays with one activity until something more interesting
      comes up (goes from one constructive activity to another).
   4. Often interrupts a constructive activity without taking
      up another one.
   5. Shifts aimlessly about.

18. Does he see another's viewpoint?

   1. Applies “fair play” of his own free will, even where it
      affects him adversely (he takes turns, co-operates,
      shares even favorite toys and with children who are
      not his special friends).
   2. Takes turns, etc., of his own free will, with friends
      only.
   3. Is willing to take turns, etc., but only when he is
      reminded by adult.
   4. When reminded by adult, he sulks but takes turns, etc.
   5. Even when reminded by adult, he refuses to take turns
      (argues, cries, tantrum, etc.).

... ways out. When the environment of the child or adult
becomes too complex or difficult, he seeks to adjust in many
different ways, some of which are socially approved, and some
are not. Figure 9 presents a chart¹ indicating certain
characteristic behavior mechanisms.

The mechanisms which usually receive social approval
consist of substitution, rationalization, and compensation, while
mechanisms socially tolerated include identification,
projection, and egocentrism. The less acceptable and definitely
inadequate methods of attempting to adjust have been
suggested in connection with psychiatric diagnosis and appear in
Figure 9. These mechanisms will receive more extended
consideration in later chapters.²

Research reveals the importance of objective normative
materials as the basis of diagnosing conduct, because teachers’
attitudes and conclusions differ widely on identical behavior.
In general, the following, among others, are desirable
characteristics of children in the primary grades:

1. Participates in work and play program without emotional strain

2. Never or rarely unhappy for long

3. Obeys requests of teacher and group decisions of class

4. Co-operates and follows, but also takes an occasional turn at
leading

5. Faces facts when in difficulty; presents his side of controversies
calmly

6. Willing to consider another’s ideas; does not know it all

7. Takes responsibility and pursues it until completed, even though
interrupted

8. Calm in reaction to minor hurts, physical or otherwise; never or
rarely unkind

9. Finds something else to do when regular tasks are completed

10. Tries new ways of doing things

11. Willing to admit legitimate failure without blaming others or
feeling chagrin

12. Takes just criticism or punishment without resentment

13. Asks for help when in doubt or difficulty after reasonable effort
to succeed alone

14. Never unduly worried, hurried, or inactive

¹ Seashore, R. H., and Katz, Barney. “An Operational Definition and Classification of
Mental Mechanisms”; in Psychological Record, vol. 1, no. 1 (March, 1937), p. 9; also Shaffer,
The Psychology of Adjustment. Boston, Houghton Mifflin Company, 1936.

² See particularly Chapter Nine.

If the natures, needs, and capacities of children were well
understood and an environment was provided in the light of
these facts, it is probable that nearly all children would
generally exhibit the above characteristics. However, a large
percentage of each group shows periodic signs of deviation.
Formerly such behavior was met with repression and
punishment. Significant deviation, however, does not generally
call for punishment, but constitutes a symptom of difficulty. It
is the responsibility of teachers to discover and remove the
causes if possible.

Diagnosis of mental maturity. While children usually enter
school on the basis of chronological age, their differing mental
maturities also present serious problems of adjustment. Many
teachers must direct the learning activities of large entering
groups. Mental maturity furnishes one important clue in
beginning certain types of work and in directing most types.
For example, children average 72 months mental age at
entrance, but research reveals that they should probably not
begin reading until a mental age of 76 or 78 months has been
attained. Nevertheless, many enter before even 72 months
mental age has been attained.
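
The comparison of mental age with the reading threshold can be put as a small calculation. What follows is only a rough sketch in Python and is not drawn from any test manual: the function names and the three sample pupils are hypothetical, the 76-month figure is the lower bound of the range cited above, and the I.Q. is assumed to be the usual ratio of mental age to chronological age, multiplied by 100.

```python
# Illustrative sketch only; names and sample data are hypothetical.
READING_MA_MONTHS = 76  # lower bound of the 76-78 month range cited in the text


def ratio_iq(mental_age_months, chronological_age_months):
    """Ratio I.Q.: mental age divided by chronological age, times 100."""
    return round(100 * mental_age_months / chronological_age_months)


def ready_for_reading(mental_age_months, threshold=READING_MA_MONTHS):
    """True when the pupil's mental age meets the assumed threshold."""
    return mental_age_months >= threshold


# pupil: (mental age, chronological age), both in months
pupils = {"A": (72, 72), "B": (80, 74), "C": (66, 75)}
for name, (ma, ca) in pupils.items():
    print(name, ratio_iq(ma, ca), ready_for_reading(ma))
```

A pupil of average mental age at entrance (72 months) falls below the assumed threshold, which is the difficulty the paragraph above describes.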

Perhaps the most widely used individual test of mental
maturity is the Stanford-Binet.¹ A recent revision provides two
scales, L and M. It possesses high reliability. It yields not
only an M.A. and I.Q., but much specific detailed information
concerning the nature of the child’s intellectual maturity
which may be useful in guiding his learning activity. Figure 10
presents the scoring key for the picture completion of a bird
test. Throughout, scoring is objective and personal opinion is
reduced to a minimum.

¹ See Terman, L. M., and Merrill, M. A. Measuring Intelligence, p. 344. Boston,
Houghton Mifflin Company, 1937.

FIG. 10. SCORING KEY, FORM M, STANFORD REVISION OF BINET
TEST, YEAR IV-6 (panels: Three Credits; Two Credits)

From Terman, L. M., and Merrill, M. A., Measuring Intelligence, p. 344. Boston,
Houghton Mifflin Company. Reproduced by permission of the publishers.

Many have found that the Sullivan Psychographic Record
Blank¹ aids in the analysis and interpretation of the
Stanford-Binet test results. With this blank, test results are
classified into the following fourteen types:

1. Language comprehension 

2. Spontaneous interest 

3. Motor co-ordination 

4. Attention 

5. Spatial relations

6. Discrimination 

7. Comparison 

8. Comprehension (of practical situations and how to meet them) 

9. Memory 

10. Orientation in time 

11. Strength of mental imagery and ability to manipulate it

12. Maturity of apperceptive processes 

13. Associative processes 

14. Higher thought processes (thinking) 

Such classification, with indications of success or failure in 
each, provides the teacher with many suggestions for guiding 
learning which are not evident from a mere inspection of test 
results or which might be lost in transmitting test results from 
the examiner to the teacher. 

Most group tests of intelligence or mental maturity yield
only an M.A. from which an I.Q. may be computed. Evidence
previously presented revealed the limitations of the I.Q. for
individual guidance. It is desirable that the more or less
independent, yet interrelated, abilities which together appear to
constitute mentality or intelligence be evaluated separately,
thus providing for the teacher a group of factors possessing a
closer functional relationship to teaching and learning than the
M.A. or I.Q. The California Tests of Mental Maturity² were
developed in the effort to provide such information. Both the
pre-primary (kindergarten and entering first grade) and
primary (grades 1, 2, and 3) batteries possess the following
features:

1. Detect pupils who possess visual, auditory, or co-ordination
difficulties sufficiently serious to prevent a valid group test. The
same test may be given individually to these pupils.

2. General, language, and non-language I.Q.'s. The first may be
used as a measure of general group ability and for comparison with
I.Q.'s obtained from other tests; the latter two suggest the most
profitable general types of pupil activity, verbal and non-verbal.

3. More specific measures of memory, spatial relationship, reasoning,
and apperceptive factors. The test manual suggests the
appropriate types of activities for deficiencies in each of these
factors so that teachers may use these results for guidance purposes.

4. The automatic diagnostic profile. The teacher may obtain
relative measures of maturity in five general factors and sixteen more
specific types of activity without any computation whatever:
checking of the scores on the profile automatically reveals the
status of each.

¹ Sullivan, Elizabeth T. Psychographic Record Blank. Los Angeles, Calif., California
Test Bureau, 1926.


Figure 11 reproduces Test 5 (delayed recall) of the
pre-primary battery. A story is read to pupils, and some
twenty-five minutes later they answer questions by checking the
appropriate pictures.

Diagnosis in reading. An important step in preparation
for reading is the determination of mental age. While an
M.A. of 76 to 78 months is desirable, teachers must regularly
work with beginners below this age, and in general they must
be prepared to give more attention to pupils as they range
below this level. Where possible, children may be kept in
transition, or pre-reading groups, until they attain a
satisfactory mental age.

Many children have not developed the necessary physiological
eye-movement maturity to begin reading when they enter
the first grade; others suffer from visual or auditory defects
which may prove a serious handicap to progress. All entering
children should receive careful eye and auditory
examinations before being allowed to attempt the complex of
activities known as reading. Where no better facilities are
available, such devices as the Snellen¹ or Bausch and Lomb² eye
charts may be used to detect crude defects of vision. Because
watches vary so in the intensity of the tick, the watch test of
auditory acuity must give way to something better. If the teacher
will place pupils in the back seats of the classroom and, talking
in an ordinary tone of voice, make simple requests of individual
pupils, she will discover if they can hear what she says when
teaching reading.

² Sullivan, E. T., Clark, W. W., and Tiegs, E. W. California Test of Mental Maturity.
Los Angeles, Calif., California Test Bureau, 1936-37.

FIG. 11. TEST 5 - DELAYED RECALL
California Tests of Mental Maturity, Pre-Primary Battery.

However, much more satisfactory devices are available.
Hearing should be tested with a Western Electric or McCallie
audiometer. Such instruments are not expensive and should
be standard equipment in the modern school. The picture
on the opposite page is an illustration of the 4A audiometer.³
It is easy to use, and several pupils may be tested
simultaneously. The number of units of hearing may be definitely
determined and utilized in seating, special help, and the like.
The picture opposite page 113 illustrates the method of using
the audiometer.

Ideally, all children should receive an eye examination by a
competent physician before attempting reading. However,
with the use of the Betts⁴ Ready-to-Read Tests, teachers may
obtain, among other types of information, fairly accurate
measures of far-and-near-point fusion, vertical and lateral
imbalance, depth perception, far-and-near-sightedness, and
astigmatism. The instrument reveals when eye difficulties are
sufficiently serious to refer to a physician. It is illustrated
opposite page 118. Pupils should be ready to read before they
are allowed to attempt the Betts Ready-to-Read Tests;
otherwise they may become uninterested or unresponsive, or attempt
to prevent the teacher from discovering their disabilities.

¹ Published by the E. W. Reynolds Company, Los Angeles, California.

² Published by the Bausch and Lomb Optical Company, Rochester, New York.

³ Distributed by the Western Electric Company.

⁴ Betts, E. A. Ready-to-Read Tests. Meadville, Pa., Keystone View Company, 1934.

FIG. 12. A 4A AUDIOMETER

FIG. 13. USE OF THE 4A AUDIOMETER

Certain tests of reading readiness which utilize basic
capacities, as well as skills already acquired, may be used. Among
these are the Van Wagenen, the Hildreth and Griffith, the
Monroe, and the Lee-Clark Reading Readiness¹ tests. Table
14 shows the probable per cent of failures among those taking
the Lee-Clark test.


Table 14. Probable Per Cent of Failures at Each Score
Level of the Reading Readiness Test



The authors call attention to the importance of delaying
reading until children are ready to read; otherwise there is
danger of developing serious personality defects.

The Monroe Reading Aptitude Tests² are designed primarily
to survey and classify pupils entering the first grade. They
consist of both “group” and “individual” items, and include
visual, auditory, motor, articulation, and language sections.
The reliability of the tests is high. Like the Lee-Clark Reading
Readiness Test, they provide norms which predict probable
reading ability. Figure 14 presents three items from the
auditory test in Sound Blending.

¹ Lee-Clark Manual. Reading Readiness Test. Los Angeles, Calif., California Test
Bureau, 1931. Reproduced with permission of the publishers.

² Published by Houghton Mifflin Company, Boston, 1935.

During the pre-primer period, children should learn at least
one hundred of the words which will be necessary when they
transfer to books. Merely “teaching” these words is no
guarantee that they will be learned. Fifteen to twenty per
cent of all beginners who finish the first grade are unable to
read first-grade materials satisfactorily. This is due largely to
inadequate diagnostic testing and evaluation. It is extremely
important that teachers be aware of the exact status of each
pupil and that every effort be made to aid every pupil to attain
the minimum objective of one hundred words. Short lists of
five to ten words may be mimeographed and pupils asked to
mark them as the teacher pronounces. Children may
pronounce words while pointing to them on the blackboard.
Standardized tests provide another valuable type of test of
word recognition. Tests like that presented in Figure 15 from
the Gates Primary may be used even though they contain some
words not taught; words not taught may be ignored, even
though many pupils will recognize some of them.

FIG. 14. AUDITORY TEST 2 - SOUND BLENDING
From the Monroe Reading Aptitude Tests. Houghton Mifflin Company, Boston, 1935.

FIG. 15. ILLUSTRATION OF TEST OF WORD RECOGNITION
(words shown: saw, run, man, met, bird, fire, bring, hard, made, ball, play, baby)
From the Gates Primary Reading Test. Bureau of Publications, Teachers College,
Columbia University, New York, 1926.

Oral reading is also important as a diagnostic technique.
Before pupils are sufficiently advanced to take regular
paper-and-pencil tests, correct oral reading constitutes an important
evidence of comprehension. Again, oral reading reveals
difficulties of pronunciation, enunciation, and the like. Gray has
developed a series of standard test sets appropriate for grades
one to eight. As the pupil reads, the teacher notes errors of
pronunciation, enunciation, substitution, omission, and the
like, and also secures the normal reading rate. The following
excerpt from the directions¹ will indicate how the test is used:

a. The time record is secured by noting the exact second at which the
pupil begins reading a paragraph and the time when he completes
it. The number of seconds required to read the paragraph should
be recorded in the margin to the right of the paragraph.

b. In order to illustrate clearly the character of the errors and the
method of recording them, the following paragraph is presented:

The sun pierced into my large windows. It was the opening of
October, and the sky was a dazzling blue. I looked out of
my window down the street. The white houses of the long,
straight street were almost painful to the eyes. The clear
atmosphere allowed full play to the sun’s brightness.

If a word is wholly mispronounced, underline it as in the case of
“atmosphere.” If a portion of a word is mispronounced, mark
appropriately as indicated above, for example, “pierced” pronounced
in two syllables; sounding long a in “dazzling”; omitting the s in
“houses,” the al in “almost,” or the r in “straight.” Omitted words
are marked as in the case of “of” and “and”; substitutions as in the
case of “many” for “my”; insertions as in the case of “clear”; and
repetitions as in the case of “to the sun’s.” Two or more words
should be repeated to count as a repetition.

¹ From the Gray Oral Reading Check Test. Bloomington, Ill., Public School Publishing
Company, 1923.

Record the exact nature of each error as nearly as you can. When 
you are unable to define clearly the specific character of an error, 
underline the word or portion of the word mispronounced. Be sure 
you put down a mark for each error. In case you are not sure that 
an error was made, give the pupil the benefit of the doubt. If the 
pupil has a slight foreign accent, distinguish carefully between this 
difficulty and real errors. 
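
The time record and the error marks described in these directions lend themselves to a simple summary. The sketch below, in Python, is offered only as an illustration and is not part of the Gray test: the function names, the error labels, the word count of 50, and the 40-second timing are all hypothetical.

```python
# Illustrative sketch only; labels, counts, and timings are hypothetical.
from collections import Counter


def words_per_minute(word_count, seconds):
    """Reading rate from a paragraph's word count and the recorded seconds."""
    return 60.0 * word_count / seconds


def tally_errors(marks):
    """Count the recorded error marks by type."""
    return Counter(marks)


# One label per mark the teacher put down while the pupil read aloud.
marks = ["mispronunciation", "omission", "omission",
         "substitution", "insertion", "repetition"]

rate = words_per_minute(word_count=50, seconds=40)
errors = tally_errors(marks)
print(rate, sum(errors.values()), errors["omission"])
```

Keeping the rate and the error tally side by side lets the teacher see whether a pupil reads quickly at the cost of accuracy, or accurately but slowly.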

Even though eye tests show no defects before reading begins,
pupils often develop them through misuse, or suffer accidents.
Furthermore, tests with the eyes at rest do not tell the whole
story. Adequate reading requires definite eye control and
specific eye movements. It is desirable, therefore, in the case
of actual or suspected difficulty to have eye movements
photographed during the act of reading. Figure 16 presents certain
information related to the adequacy of eye movements.¹ Each
vertical line represents a pause or fixation. The eyes can
recognize symbols only during fixations. The numbers at the
tops of vertical lines show the order in which fixations occurred.
The duration of each fixation is indicated by a number at the
bottom of each vertical line, expressed in twenty-fifths of a
second: when an x appears, it indicates an

FIG. 16. EYE MOVEMENTS IN TYPICAL SILENT AND ORAL READING
Plate 2. Oral-reading record of subject CW. Record subsequent to that shown in Plate 1.
Plate 9. Silent-reading record of subject GH. Passage from a text in algebra.
Plate 12. Silent-reading record of subject 104. Rapid, superficial reading.

Charles Hubbard Judd and Guy Thomas Buswell, Silent Reading: A Study of the
Various Types, Supplementary Educational Monographs, No. 23. Chicago: Department of
Education, University of Chicago, 1922.

A number of workers have devised cameras for obtaining
the type of data shown in Figure 16. Clark of the University
of Southern California invented one which provides
horizontal and vertical records of both eyes. The
Ophthalm-O-Graph² illustrated facing page 119 is another such camera.
Light is reflected from the subject's eyes during
reading and photographed on a moving film. Among the types
of information which may be obtained from such photographs
are the following:

1. The number and duration of fixations
2. The number and duration of regressions
3. The words, phrases, and types of materials which are most difficult
4. The adequacy of the habits of reading from left to right
5. The adequacy of eye control in following lines
6. Whether or not subject reads phrases as the unit of span
7. Whether or not eyes work perfectly together or whether vertical or
horizontal imbalance exists
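
The first two items in this list can be computed directly once a photographic record has been read off. The following Python sketch is offered tentatively: the function name and the sample record are hypothetical, durations are assumed to be given in twenty-fifths of a second as in Figure 16, and a regression is assumed to mean any fixation that falls to the left of the one before it on the same line.

```python
# Illustrative sketch only; the record and the definition of a regression are assumptions.

def summarize_fixations(fixations):
    """Summarize one line of reading.

    fixations: list of (position, duration_in_25ths) in order of occurrence,
    where position increases from left to right along the line.
    """
    total_seconds = sum(duration for _, duration in fixations) / 25.0
    regressions = sum(
        1
        for (previous, _), (current, _) in zip(fixations, fixations[1:])
        if current < previous  # the eye moved back toward the left
    )
    return {
        "fixations": len(fixations),
        "regressions": regressions,
        "total_seconds": total_seconds,
        "mean_seconds": total_seconds / len(fixations),
    }


record = [(1, 6), (3, 5), (2, 8), (5, 6)]  # hypothetical record of four fixations
summary = summarize_fixations(record)
print(summary)
```

Items 3 to 7 in the list depend on judgment about the material and the pupil, and are not attempted in this sketch.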

The Ophthalm-O-Graph record does not reveal the degree
of comprehension of the materials read; comprehension
tests should therefore accompany the use of the Ophthalm-O-Graph.
Where an Ophthalm-O-Graph or similar instrument is not
available, the teacher may place a mirror on the page of an open
book and while the pupil reads the other page she may observe
and record movements from behind the pupil. However, such
informal tests do not possess the validity of photography.

¹ Judd, C. H., and Buswell, G. T., University of Chicago.

² Distributed by the American Optical Company.

FIG. 17. BETTS READY-TO-READ TESTS
Perhaps the outstanding contribution of the
Ophthalm-O-Graph is to reveal the exact nature of eye movements. Some
research has shown that eye movements (hence rate of
reading) can be improved through training. The most ambitious
and comprehensive attempt to train eye movements has
resulted in the development of the Metron-O-Scope illustrated
on the opposite page.

The Metron-O-Scope is a device with a long narrow opening, 
back of which is an electrically driven roll like that of a player 
piano. The copy of the roll is so placed that the first, second, 
and third equal parts from left to right are visible in turn. 
Since only one part is visible, there can be no serious regressive 
movements; the eyes are trained in the natural left to right 
movement. The amount of time allowed for each fixation may 
be varied and adjusted to the needs of individuals and classes 
according to the varying difficulties of different words, phrases, 
and types of materials. The flash-card principle has been 
utilized long and successfully in reading and arithmetic, and 
should prove valuable in this form. 

Buswell of the University of Chicago has developed a
variation of the Metron-O-Scope idea in the form of a film
projector. Stories of appropriate nature and difficulty are filmed
by phrases. By changing the speed at which the film passes
through the machine, subjects may receive practice in reading
at a rate which may vary from less than 200 to above 650 words
per minute. This plan has some distinct advantages, since
so many schools already own projectors, and carefully
developed films can be duplicated with relatively little expense.
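
The relation between reading rate and the time each phrase stays on the screen is simple arithmetic. The Python sketch below is a hypothetical illustration rather than a description of Buswell's apparatus: the function name and the three-word phrase are assumptions, and only the rates of 200 and 650 words per minute come from the text.

```python
# Illustrative sketch only; the phrase length is a hypothetical example.

def seconds_per_phrase(words_in_phrase, words_per_minute):
    """Exposure time for one phrase at a given reading rate."""
    return 60.0 * words_in_phrase / words_per_minute


slow = seconds_per_phrase(3, 200)  # a three-word phrase at 200 words per minute
fast = seconds_per_phrase(3, 650)  # the same phrase at 650 words per minute
print(round(slow, 2), round(fast, 2))
```

At the slower rate a three-word phrase is exposed for nine tenths of a second; at the faster rate, for a little more than a quarter of a second.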

It is sometimes difficult to determine whether bad eye
movements are the causes or symptoms of poor reading.
Eye-movement patterns can easily be changed by a variety of means,
including differences in motivation and the nature and
difficulty of reading materials. Yet in spite of individual
differences between pupils, growth of reading ability should be
fairly continuous for each. Diagnosis of learning difficulty
should be continuous, therefore, in order to afford necessary
guidance.

However, the teacher who desires to know if pupils at
various stages are making normal progress faces a problem. Such
questions as “Has pupil grown in ability?” “Does pupil show
evidence of desirable adjustments?” “Is the student making
progress?” usually used as tests or criteria, may all be answered
in the affirmative even for failing pupils without revealing any
significant specific information, since most pupils normally
grow this way. It is impossible to judge if the work of a
month, quarter, or semester represents normal progress
without the use of some carefully developed normative test. It is
desirable, therefore, to give all pupils at least once each quarter
or semester a diagnostic test which reveals not only the
important learning difficulties, but at the same time reveals whether
or not individual pupils, as well as the class as a whole, are
making normal progress. Pupils who show more or less
continuous difficulty should be given such tests at more frequent
intervals.

Figure 20 presents the profile of a reading test which
automatically reveals to the teacher the extent to which certain
abilities have been acquired. The heavy vertical line shows
the actual grade placement of the pupil.

The lighter lines connect the scores actually made on six
different aspects of reading ability. Note that by merely
checking on the horizontal lines the scores made on these
aspects, and laying a ruler vertically through these points, the
teacher may without any computation determine the grade
and month to which performance corresponds. When the
graph of pupil ability falls near or to the right of the
heavy vertical line, it reveals to the teacher that no special
remedial work is necessary. However, if performance falls
significantly to the left of the heavy vertical line, she should
utilize some such diagnostic analysis as comprises the second
page of each test, as presented in Figure 21.

FIG. 20. COMBINED DIAGNOSTIC AND NORMATIVE READING PROFILE TEST GIVEN IN JUNE TO A 2A PUPIL
Diagnostic Profile (columns: Test; Subject; Possible score; Pupil’s score; Grade placement)
From Progressive Primary Reading Tests, Form A. Los Angeles, Calif., California Test Bureau.
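
The reading of such a profile amounts to comparing each subtest's grade placement with the pupil's actual grade placement. The Python sketch below is a hypothetical illustration only: the subtest names, the scores, the half-grade margin standing in for “significantly,” and the decimal way of writing grade and month are all assumptions, not features of the published test.

```python
# Illustrative sketch only; subtest names, scores, and the margin are hypothetical.

def subtests_needing_help(actual_grade, placements, margin=0.5):
    """Return the subtests whose grade placement falls more than `margin`
    grade units to the left of (below) the pupil's actual grade placement."""
    return [name for name, grade in placements.items()
            if grade < actual_grade - margin]


# Grade placements written as grade plus tenths of a school year (assumed convention).
placements = {
    "word recognition": 2.1,
    "following directions": 3.0,
    "interpretation": 2.8,
}
flagged = subtests_needing_help(actual_grade=2.9, placements=placements)
print(flagged)
```

Only the flagged subtests would then be carried over to a diagnostic analysis of the kind the text goes on to describe.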

By reviewing the errors in the test and checking them on the 
diagnostic analysis, the teacher automatically prepares the 
objectives of the individual remedial program for the pupil 
without any arduous record making. As the pupil attains 
these objectives through remedial activity, she may cross 
them off. 

Reading is the most important skill which is utilized in
learning. No effort should be spared in assuring normal progress.
Since growth in reading ability (and in other desirable
outcomes of learning) is dependent upon a number of important
related factors, the following have been presented:

1. Physical diagnoses

2. Psychiatric diagnoses

3. Personality and conduct diagnoses

4. Mental-maturity diagnoses

In dealing specifically with the problem of reading, the
following instruments and devices have been considered:

1. The audiometer

2. Eye charts

3. Betts Ready-to-Read Test

4. Reading readiness tests

5. Word mastery tests

6. Eye-movement records

7. Eye-movement photography

8. Devices for training eye movement

9. Diagnostic reading tests and profiles

10. Diagnostic analysis of reading difficulties

Growth in reading depends not only upon diagnosing
learning difficulties in the skills involved, but also in certain
concomitant habits and attitudes which do not lend themselves so
easily to quantitative or normative treatment. However, the
teacher should be aware of them as objectives which motivate
and reinforce learning and constitute indirect evidence of
successful growth. Among these objectives in growth are the
following, which should be used in diagnosing learning
difficulty:

FIG. 21. DIAGNOSTIC ANALYSIS, PRIMARY READING
From Progressive Reading Tests, Primary Battery, Form A. Los Angeles, Calif., California Test Bureau, 1934.

1. Ability to pronounce words and enunciate clearly in oral reading

2. Ability to read silently without pointing and without lip
movements

3. Ability to attack new words successfully

4. Ability to utilize marks of punctuation in interpreting oral and
silent reading

5. Ability to get information for immediate use

6. Formation of habit of reading for pleasure

7. Ability to answer questions on what is read

8. Ability to use table of contents, page numbers, etc.

9. Ability to understand and follow directions in written form

10. Ability to care for books

11. Attitude of co-operation in reading activities

12. Ability to make deductions and interpretations

Some of these aspects of reading ability must be determined
through observation; others may be reliably diagnosed through
tests. Figure 22 presents a directions test, and Figure 23, a
table of contents test. Many such devices may be utilized
by teachers to obtain specific reliable information on which to
direct learning activities in primary reading more adequately.

0. Draw a line under this word: run

1. Draw a line under this letter: B

2. Write this word:

3. Put a 0 after this 6

FIG. 22. DIRECTIONS TEST

Excerpt from Progressive Reading Tests, Primary Battery, Form A. Los Angeles, Calif.,
California Test Bureau, 1934.
This is copied from a Table of Contents:

Chap.                            Page
1. Plans for Vacation . . . . . .   1
2. The Start  . . . . . . . . . .   5
3. The First Day  . . . . . . . .  10
4. The Storm  . . . . . . . . . .  13
5. Sailing the Boat . . . . . . .  18
6. Returning Home . . . . . . . .  25

Page 15 will tell something about ..........

Page 7 will tell something about ..........

FIG. 23. TABLE OF CONTENTS TEST

Excerpt from Progressive Reading Tests, Primary Battery, Form A. Los Angeles, Calif.,
California Test Bureau, 1934.
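
The reasoning this test asks of the pupil is to find the last chapter that begins on or before the page named. The Python sketch below sets that reasoning out for the table in Figure 23. It is an illustration only: the function name is hypothetical, and the first question is read as referring to page 15.

```python
# Illustrative sketch only; the function name is hypothetical.
from bisect import bisect_right

# (starting page, chapter title), as copied in the table of contents above
CONTENTS = [
    (1, "Plans for Vacation"),
    (5, "The Start"),
    (10, "The First Day"),
    (13, "The Storm"),
    (18, "Sailing the Boat"),
    (25, "Returning Home"),
]


def chapter_for_page(page, contents=CONTENTS):
    """Return the title of the last chapter starting on or before `page`."""
    starts = [start for start, _ in contents]
    index = bisect_right(starts, page) - 1
    if index < 0:
        raise ValueError("page comes before the first chapter")
    return contents[index][1]


print(chapter_for_page(15))
print(chapter_for_page(7))
```

A pupil who answers with the chapter whose starting page is merely nearest to the page named, rather than the one that contains it, shows a specific misunderstanding the teacher can address.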


While diagnoses of physical, psychiatric, personality and
conduct, and mental-maturity factors were presented in
connection with reading, they are related to all learning on all
levels and these basic concepts will not be repeated. Rather,
such instruments and devices as are appropriate to different
levels will be considered.


QUESTIONS AND EXERCISES 

1. What factors create special problems of diagnosis in the early
primary years?

2. Outline your own plan for primary physical diagnosis. 

3. Indicate important aspects of psychiatric and personality diagnosis. 

4. What are the main features of the Joël scale?

5. What are mental mechanisms? 

6. Name several mental mechanisms and indicate whether or not they 
are socially approved. 

7. Discuss the diagnosis of mental maturity on this level. 

8. Outline the major factors in determining reading readiness. 

9. Explain the diagnosis of learning difficulties in reading. 

10. Explain: L and M scales; Sullivan psychograph; diagnostic profile; 
audiometer; Ophthalm-O-Graph; Metron-O-Scope.

CHAPTER SIX


DIAGNOSIS AND APPRAISAL IN
PRIMARY LEARNING (Concluded)

Up to this point we have considered physical, psychiatric,
personality and conduct, and mental-maturity diagnosis and
appraisal, and have presented important problems and techniques
in dealing with the improvement and measurement of
reading ability. This chapter will continue the discussion in
connection with arithmetic, language, writing, spelling, music,
and art.

Diagnosis in arithmetic. A recent survey of thirty-three of
the fifty largest cities of the United States showed that twenty-seven
did not utilize arithmetic books until the third grade.
For the most part, therefore, number work in the first two
years is informal. However, this does not mean an absence
of objectives, even though these differ considerably as to type
and specificity. Since diagnosis should be based on objectives,
the following list of widely accepted objectives * is
presented as the point of departure. In using measurement
in directing learning, it matters not whether this list is used,
or some other, or whether this order or some other is followed
in teaching and learning. Since needs and capacities of pupils
have not been determined so carefully here as in some other
fields, the list may be considered tentative:

1. Ideas of quantity; distinguishing between one or more than one,
few and many, and more or less. Elementary estimating and
measuring and verifying

2. Counting, abstractly and concretely (25-100)

3. Distinguishing between numbers in a series and numbers which
designate the magnitude of a group or collection (10th boy or
10 boys)

4. Mastery of the oral vocabulary necessary to utilize the ideas of
points 1, 2, and 3, above

5. Mastering word symbols and digit symbols which correspond
to the common number ideas and oral vocabulary possessed by
the group (10 and ten, after idea of this quantity is mastered and
utilized orally)

6. Finding pages in the book

7. Partial quantities: etc., of concrete groups

8. U.S. money: penny, nickel, dime, quarter, and half-dollar

9. Simple combinations with money as well as abstract numbers

10. Subtraction idea; developed as “taking away”

11. Adding (without carrying) one or two columns

12. Multiplying; developed through addition idea

13. Simple denominate numbers

14. Simple long division

15. Short division

16. Recognizing simple signs and symbols

17. Familiarity with and ability to get quantitative information from
pictures, graphs, sketches, maps, etc.

* Tiegs, E. W. The Management of Learning in the Elementary Schools. New York,
Longmans, Green and Co., 1937.

A measure of mental age is useful in determining when to 
begin the mastery of various of the above objectives. Washburne *
and the Committee of Seven of the Northern Illinois
Conference on Supervision determined the minimum and 
optimum mental ages for the fifteen topics listed in Table 15. 

When there is difficulty in connection with any topic, it is 
well to check mental age as a possible factor. This does not 
mean that children who are below these minima cannot learn; 
neither does it mean that pupils at or above the maxima will 
need no help. In general, as pupils fall below these standards, 
more concrete activities, more drill, more detailed diagnosis 
and more remedial work will be necessary. 
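
The check suggested above can be sketched in a few lines of Python. This is only an illustration: the function name `readiness` and the three labels it returns are assumptions made here, not part of the Committee of Seven's report, and only a few rows of Table 15 are entered.

```python
# Mental ages are stored as (years, months), as in Table 15.
# Only a few rows are entered; the labels returned are illustrative.
TABLE_15 = {
    "Addition facts, sums 10 and under": ((6, 5), (7, 4)),
    "Addition facts, sums over 10": ((7, 4), (7, 11)),
    "Subtraction facts, easier 50": ((6, 7), (8, 3)),
    "Subtraction process": ((8, 9), (8, 9)),
}


def to_months(age):
    """Convert a (years, months) pair to a number of months."""
    years, months = age
    return years * 12 + months


def readiness(topic, mental_age):
    """Compare a pupil's mental age with the minimum and optimum for a topic."""
    minimum, optimum = TABLE_15[topic]
    ma = to_months(mental_age)
    if ma < to_months(minimum):
        return "below minimum"
    if ma < to_months(optimum):
        return "between minimum and optimum"
    return "at or above optimum"


print(readiness("Addition facts, sums over 10", (7, 0)))  # below minimum
print(readiness("Addition facts, sums over 10", (7, 6)))  # between minimum and optimum
```

A result of "below minimum" is a signal to check further, not a verdict: as noted above, such pupils can still learn, but are likely to need more concrete activities, more drill, and more detailed diagnosis.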

* Washburne, C. W. "Mental Age and the Arithmetic Curriculum"; in Journal of Educational
Research, vol. 33, p. 310.

Table 15. Minimum and Optimum Mental Ages for Success
in Various Topics in Arithmetic

                                                        Mental Ages
                                                    Minimum     Optimum
Topics                                              Yr.  Mo.    Yr.  Mo.

 1. Addition facts
      Sums 10 and under                               6    5      7    4
      Sums over 10                                    7    4      7   11
 2. Subtraction facts
      Easier 50                                       6    7      8    3
      Harder 50                                       7    8      8   11
 3. Subtraction process                               8    9      8    9
 4. Compound multiplication                          10    4
 5. Short division                                   11    4
 6. Long division
      Easier elements                                10    9
      Harder elements                                12    7
 7. Meaning of fractions                              9    0     10    9
 8. Addition and subtraction of similar fractions
      and mixed numbers, with no carrying             9   10     11    1
 9. Multiplication of fractions                      12    3     14    2
10. Division of fractions                            12    3
11. Addition and subtraction of decimals             10   11     12    6
12. Division of decimals                             13    0     14   10
13. Fractional and decimal equivalents               11    6     13   10
14. Case I percentage                                12    4     13   11
15. Simple bar graphs (Chronological age)            10    5     10    5

Regardless of the particular set of objectives adopted for the
informal work of the first two years, pupils who enter each year
will differ widely in respect to the mastery of any one. And 
regardless of the educational philosophy of the teacher, or the 
types of activities used, success in attaining objectives in primary
arithmetic can be revealed only through proper tests,
oral and individual, or, later, through written tests. The
major cause of failure in arithmetic is not inability of pupils to
succeed, but failure of teachers to detect learning difficulties
and provide adequate remedial work. Diagnosis is of special
importance in beginning arithmetic, because, as in reading, a
whole train of unfortunate results follows a bad start. Also,
one success with a given objective is not sufficient. Often 
chance plays a part in success, and many children forget what 
they have learned. The teacher must, therefore, diagnose 
regularly for past objectives as well as for current effort. 

Since most teaching properly begins with concrete objects,
such objects should be used in testing. In general, it is bad
practice to attempt to mark pupils while directing educational 
activities; but it is good form to have some such check sheet 
as is illustrated in Figure 24, with pupils listed on the left side 
and specific objectives across the top, in order to be able to 
obtain a record of the exact difficulties of each pupil. 


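[Check sheet: the names of pupils (James, Oscar, Jennie, William, and others) are listed
down the left side; the columns across the top are headed Few and Many, More and Less,
Counting Abstractly, Counting Concretely, Finding Pages, and Fraction; an X in a column
records a difficulty of the pupil named.]

FIG. 24. DIAGNOSTIC CHECK SHEET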

Such a check list is not a test. It is merely a class record, 
which may be used as the basis of individual remedial work. 
Such a record may be made for a pre-test when beginning new 
topics, or after any diagnostic test. In general, in individual 
oral tests, the pupil may explain or tell in answer to questions, 
or may indicate answers by reference to concrete objects or 
drawings of them on the blackboard. Answers should not be 
suggested. The teacher should be sure that the evidence 
accepted as success is proof of the attainment of the objective 
in question. 
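
A class record of this kind can be kept in a very simple data structure. The following Python sketch is illustrative only: the marks entered for each pupil are hypothetical, and the function name `remedial_groups` is an assumption. It shows how the record may be turned around to list, for each objective, the pupils who need individual remedial work.

```python
from collections import defaultdict

# Hypothetical class record: each pupil is mapped to the objectives on which
# a difficulty was observed (the X marks on a sheet such as Figure 24).
check_sheet = {
    "James": {"More and Less"},
    "Oscar": {"Finding Pages"},
    "Jennie": set(),
    "William": {"Finding Pages", "Counting Concretely"},
}


def remedial_groups(sheet):
    """Invert the record: for each objective, list the pupils needing help."""
    groups = defaultdict(list)
    for pupil, difficulties in sheet.items():
        for objective in difficulties:
            groups[objective].append(pupil)
    return {objective: sorted(pupils) for objective, pupils in groups.items()}


groups = remedial_groups(check_sheet)
for objective in sorted(groups):
    print(objective, "->", ", ".join(groups[objective]))
```

A pupil with no marks, such as Jennie in this made-up record, appears in no group, which matches the use of the sheet as a record of difficulties rather than as a test score.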

The combinations are important in primary work. Fortunately,
they can easily be mimeographed and pupils may easily
be given the opportunity to reveal any difficulties. The combinations
should reappear from time to time in short diagnostic
tests, and occasionally all of those previously learned should
be retested. However, as soon as the pupil progresses beyond
the simplest combinations, additional diagnosis often is necessary.
The results of an objective test of combinations often
will reveal the location of a difficulty, but will not reveal its
cause. It then becomes necessary to make individual oral 
diagnoses, requiring the pupil to tell what he does and why he 
does it. 
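
As a rough illustration of how repeated short tests locate a difficulty, the Python sketch below tallies the addition combinations one pupil has missed. The attempts listed, the function name `missed_combinations`, and the threshold of two errors are all assumptions made for the example.

```python
from collections import Counter

# Hypothetical results from three short tests: each entry pairs one addition
# combination with the answer the pupil gave.
attempts = [
    ((3, 4), 7), ((5, 6), 12), ((2, 2), 4),
    ((5, 6), 10), ((3, 4), 7), ((8, 7), 15),
    ((5, 6), 12), ((8, 7), 16), ((2, 2), 4),
]


def missed_combinations(attempts, threshold=2):
    """Return combinations answered wrongly at least `threshold` times."""
    errors = Counter(
        combo for combo, answer in attempts if answer != sum(combo)
    )
    return sorted(combo for combo, count in errors.items() if count >= threshold)


print(missed_combinations(attempts))  # [(5, 6)]
```

The tally shows only where the trouble lies. Why the pupil answers 5 + 6 wrongly must still be learned by having him tell what he does and why he does it.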

Even though children are too young to write, and standard 
printed tests are not available, much testing for outcomes may 
nevertheless be objective and in group form. For example, the 
whole class may find certain pages of a book at the same time, 
and by a quick inspection of books the teacher can detect 
pupils who are in error. Multiple-choice answers to certain 
questions may be mimeographed, and the test may be given 
orally or on the blackboard so that while the test items may 
be varied the same answer sheets can be used. Numbers may 
be mimeographed as in Figure 25 with twenty or thirty lines 
each, containing all numbers from 1 to 20 or 25. As the teacher
presents the problems and gives their numbers, pupils simply 
underline the correct answers. 


Problem
Number    Answers

   1      1  2  3  4  5  6  7  8  9  10  11  12  13  14  15  16  17  etc.
   2      1  2  3  4  5  6  7  8  9  10  11  12  13  14  15  16  17  etc.
   3      1  2  3  4  5  6  7  8  9  10  11  12  13  14  15  16  17  etc.
   4      1  2  3  4  5  6  7  8  9  10  11  12  13  14  15  16  17  etc.
   5      1  2  3  4  5  6  7  8  9  10  11  12  13  14  15  16  17  etc.


FIG. 25. ILLUSTRATION OF SIMPLE ANSWER RECORD SHEET

In Figure 25, the correct answer to problem no. 1 was 3; to
problem no. 2, 1, etc. The same principle may be applied by
mimeographing columns of fairly large clock faces, each identified
with a number. Regardless of the hour, these may
always serve as the answer sheet. While used at first with the 
whole class, it may later be restricted to those whose responses 
continue to show inability to tell time. 
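
The advantage of such a sheet is that the test may change while the sheet stays the same. The following Python sketch illustrates the idea under stated assumptions: the first two entries of the key follow the example above, while the third entry, the pupils' responses, and the function name `problems_missed` are hypothetical.

```python
# Answer key for one oral test: problem number -> correct answer.
# The first two entries follow the text (problem 1 -> 3, problem 2 -> 1);
# the third is made up for illustration.
answer_key = {1: 3, 2: 1, 3: 7}

# Numbers each pupil underlined on the mimeographed sheet (hypothetical).
responses = {
    "Mary": {1: 3, 2: 1, 3: 7},
    "John": {1: 3, 2: 2, 3: 7},
}


def problems_missed(key, underlined):
    """Return the problem numbers whose underlined answer differs from the key."""
    return [number for number, correct in sorted(key.items())
            if underlined.get(number) != correct]


for pupil, underlined in responses.items():
    print(pupil, problems_missed(answer_key, underlined))
```

When the teacher varies the problems, only `answer_key` changes; the sheets in the pupils' hands and the scoring routine remain as they were.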

While most diagnoses will be made first in terms of careful
systematic observations (and records on check sheets) and 
later with teacher-made objective tests, it is advisable to 
utilize carefully prepared standardized diagnostic tests from 
time to time. In such tests, most of the important specific
objectives will appear in proper order, and the tests often contain
important objectives which have escaped the teacher in making her
own tests. In addition, such tests also reveal to the teacher
whether or not her pupils are doing as well as other children 
doing the same work. 

Figure 26 presents samples of several test-forms used in 
standardized diagnostic tests.*

1. Draw a line under the right word: 

     - means    add    subtract    multiply    divide
     + means    add    subtract    multiply    divide

2. Write in the correct sign:

     4 __ 4 = 8

     6 __ 2 = 3

3. Draw a line under the right word: 

$ means time money weight 
hrs. means time money weight 

4. If Harry had 2 marbles and John gave him 4 more, how many 

would he have then? Answer. 

In this problem I 

(Underline the right word) 
add subtract multiply divide 

FIG. 26. ILLUSTRATION OF DIAGNOSTIC TEST-ITEM FORMS 

* From the Progressive Achievement Tests, Primary Battery, Form A. Los Angeles, Calif.,
the California Test Bureau, 1934.

Diagnosis in language. Most language work will be oral
in the primary grades. Instruments such as the Gray Oral
Reading Check Tests, described in connection with diagnosis
in reading, will aid in checking certain skills as well as in revealing
speech defects which may interfere with the development
of language ability. In addition to habituating correct forms
of speech, the teacher sets up situations which create the desire
and, if possible, the necessity for communication. Pupils
differ in the number of people with whom they can communicate;
hence another objective consists in leading them
gradually from conversation with individuals to easy and
thoughtful exchange of ideas with the whole class participating.
Since so much of the activity is oral at first, rather than written,
the teacher must depend largely upon individual diagnosis.
The major cause of the wide differences in outcomes,
and the persistence of certain language errors, is not lack of
pupil ability. Teachers simply have trusted to general routine
instead of making check lists of important objectives,
and recording the difficulties of pupils as language activities
proceed. Such check lists constitute the basis of intelligent
remedial programs.


[Record sheet: the names of pupils (Mary, John, William) are listed down the left side;
under the general heading Some Objectives the columns include Pronunciation, Enunciation,
Desire to Participate, Posture, Presence, Usage, Modulation, Story Sense, and
Dramatization; an X in a column records a difficulty of the pupil named.]

FIG. 27. DIAGNOSTIC RECORD SHEET-LANGUAGE


Figure 27 represents one type of diagnostic record sheet. 
Since evidence of achievement and growth is oral and elusive,
such a sheet should be in constant use. Each teacher should 
also be alert in detecting difficulties in objectives presumably 
attained previously. 

Later, in the second and particularly in the third year, when 
writing and spelling can be used as tools, tests involving capitalization,
punctuation, good usage, and the like may be
utilized. Figure 28, for example, presents such a punctuation
test.* The sentences should be in familiar language and
involve familiar uses of all marks of punctuation learned up
to the time the test is given.

II. Put periods, commas, question marks, or quotation marks where
they are needed in these sentences. 

1. Will you be at the party 

2. Mary please pick up the book 

3. Bread meat and milk were sent to the poor people 

4. Do we write Feb. for February 

5. The fruit was divided equally among John Mary and Sarah 

6. Mr. Williams said I shall bring you a box of candy 

FIG. 28. SAMPLE OF PUNCTUATION TEST

Diagnosing writing difficulties. Writing difficulties are relatively
simple to detect since a permanent record is left. In
diagnosing difficulties and planning remedial work, care must 
be observed not to shift the emphasis from writing as a means 
of expression and communication to writing as an end in itself, 
where beauty, rhythm, and art aspects are too much stressed. 
Children differ considerably in motor co-ordination, hence many 
members of a given class will not reach the standard set for 
the grade. Many teachers misinterpret norms on writing and 
other scales. When a class as a whole reaches but does not 
exceed the standard norm, approximately half the pupils will 
be below the norm, and some of them considerably below. 
This is the normal situation, and diagnosis and remedial work 
will be stressed with those who do not have adequate skill. 
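
The point about norms can be verified with a small computation. The Python sketch below uses hypothetical scale scores and a hypothetical norm of 50, and it assumes that a class "reaches" the norm when its median score equals it.

```python
from statistics import median

# Hypothetical handwriting-scale scores for a class of eleven pupils,
# and a hypothetical grade norm of 50 on the same scale.
scores = [32, 38, 41, 45, 47, 50, 52, 55, 58, 63, 70]
norm = 50

class_median = median(scores)
below = [s for s in scores if s < norm]

print(class_median)  # 50
print(len(below), "of", len(scores), "pupils are below the norm")
```

Although this imaginary class exactly reaches the norm, five of its eleven pupils fall below it, and the lowest is eighteen points below. A class standing at the norm is therefore no evidence that each pupil has adequate skill.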

Some errors are due to poor eyesight, some to lack of attention,
some to poor writing position, some to lack of motor
co-ordination, and others just to failure to practice intelligently.

* From Los Angeles Diagnostic Tests: Language, Form 1, Grades 3-9. Los Angeles, Calif.
Published by the California Test Bureau, 1927.

Since the number of variables in diagnosing writing
is limited and work is always available for inspection, no 
teacher diagnostic record sheet is usually necessary. However,
as soon as pupils are able, it is advisable for them to commit
to writing the individual diagnoses which the teacher
makes from time to time to serve as the basis for their practice 
efforts. Among various types of errors, the following are most 
common. 

1. Poor word (and letter) forms *

2. Wrong color (due to wrong pen position) 

3. Poor alignment 

4. Poor word spacing 

5. Poor beginning and ending strokes 

6. Poor letter spacing 

7. Wrong size of letter 

While at first writing may be informal, beginning with the 
second grade handwriting scales may be used. The first step 
in diagnosis involves the use of a scale of general merit, such as 
that presented in Figure 29. 

When pupils are distinctly below standard, it is desirable to 
determine the cause. Ordinarily pupils should not be made 
too conscious of writing form until the concept of writing to 
express ideas is well set. From about the third grade on, diagnostic
scales may be used to advantage. Figure 30 presents
such a chart for diagnosing color. Since color is intimately 
related to pen position, a number of other difficulties are likely 
to disappear when color is corrected. 

Spelling diagnosis. While misspelled words may be relatively
easily identified, the location of a spelling difficulty does
not reveal its cause. Among the most frequent causes of 
spelling difficulty are the following: 

1. No use for word; no motivation for learning 

2. Meaning of word not understood 

* In the beginning the child should learn to write words as wholes; in this way he can use
writing to express himself before he can spell.



Correct Color. To have correct color you should bend your thumb
so sharply that the tip of it lifts the pen in your hand, and you should
hold the pen far enough from the point to give the nails of your two
little fingers room to carry your hand comfortably.

Too Light. Does your writing look like this? If it does, why is it hard
to read? Too light writing may be caused by holding the pen too far
from the point, too nearly perpendicular, or turned so that the eye of
the pen is underneath. Which do you do?

Shaded Curves. Does your writing look like this? If it does, why
is it so hard to read? Shaded curves are caused by holding your pen so
that the eye of the pen is turned toward the left.

Heavy Downstrokes. Does your writing look like this? If it does, why
is it hard to read? Heavy downstrokes are caused by pressing on the pen
with the forefinger.

Too Heavy. Does your writing look like this? If it does, why is it so
hard to read? Too heavy writing is caused by holding the pen so near the
point that there is not room enough for the nails of the two little fingers
to carry the hand comfortably.

FIG. 30. COLOR CHART, GRADES THREE AND FOUR

Nystrom, Ellen C. Minneapolis Self-Corrective Handwriting Charts. St. Paul, Minn.,
St. Paul Book and Stationery Company, 1927. Used with permission of the author.

Any method of diagnosis must take these five factors, among
others, into account, but the attainment of these major objectives
requires attention to many more specific objectives.
Among the difficulties which interfere with activities and retard
normal progress are the following:

1. Defective hearing 

2. Defective eyesight 

3. Defective speech 

4. Poor habits of attention 

5. Poor memory 

6. Tonal deficiency 

Before children are seated to begin rote singing, each pupil 
should be required to sing some simple song which he knows, 
or imitate tones sung by the teacher or blown on a pitch pipe. 
The identification of monotones is important, since they should 
be seated in the front of the class and receive special attention. 

Singing in concert constitutes a large part of the work, and 
there is constant danger that pupils will merely drift along, 
leaning on others, making little contribution, and failing to 
attain the objectives set up. For this reason, it is important 
that the teacher circulate among pupils and listen to them as 
they sing in concert, detecting errors and correcting them as 
far as possible without interfering with the singing. Among 
the factors to be diagnosed are the following: 

Does each pupil 

1. participate effectively and with enjoyment? 

2. give excellent attention? 

3. maintain correct position?

4. sing in time? 

5. sing in correct pitch? 

6. enunciate clearly? 

7. actually contribute, not merely follow? 

8. possess correct voice placement? 

9. use soft smooth tones? 

10. memorize easily and remember songs previously learned?

Diagnosis consists of keeping these and other desirable objectives
in mind while passing close enough to pupils to detect
difficulties. By working with each pupil in turn as the song 
goes on, diagnosis and remedial work may be continuous. 
However, complete diagnosis (as well as efficient learning) 
requires individual singing. As this goes on, the teacher, both 
for the purpose of diagnosis and for the purpose of providing a 
permanent record, may check the appropriate items on a 
diagnostic record sheet. As in other fields, children fail to 
make normal progress or fall short of their possibilities in 
music because their teachers do not know well enough the important
specific unattained objectives which are causing
trouble. 

Diagnosis in art. Because of uncertainty concerning the 
art needs and abilities of children, the desirable objectives of 
the curriculum in art, and the identification of specific objectives
agreed upon in principle, diagnosis on the primary level
is difficult, and a relatively large measure of personal judgment
is introduced. Good art depends upon adequate vision, observation,
hearing, motor co-ordination, visual memory, and
spatial ability, among other factors. These must become basic
factors of diagnosis when pupils do not make normal progress.

The major objectives of art in general are appreciation and 
enjoyment, creative activity, utility, discipline, and vocation 
ability. Primary art is concerned primarily with the first 
three. Notice that the third mentioned objective is creative 
activity. Children of the first and second grades go through 
a manipulative or symbolic stage when any line or dot may 
represent literally anything. The objective is not to create a 
true likeness of an object, since this is beyond the capacity of 
most children at first; the objective is manipulation, freedom, 
action, and the use of crayolas and paint to produce splashes 
of bright color. Theoretically, at least, this is creative activity
since it develops some of the fundamental abilities upon which
achievement in art depends. 

Near the beginning of the third year the attitudes of most 
children undergo a change. They are no longer satisfied with 
mere lines and splashes of color to represent their ideas, but 
desire their work to represent reality. From here on, the 
teacher diagnoses difficulties in terms of these new objectives. 

Art, like writing, leaves a record which may be used for 
partial diagnosis. For the purpose of providing a continuous 
permanent record diagnostic in nature which may be used as the 
basis of remedial work, the teacher should commit to writing 
all of the definite and specific, as well as more general, objectives
which she hopes to aid children to attain as outcomes.
From time to time, she should check these on her diagnostic
record sheet for all pupils. Such a diagnostic device will 
contain items like the following: 

Does the pupil 

1. manifest genuine pleasure in manipulating art materials and 
media? 

2. manipulate media freely? 

3. use art vocabulary correctly? 

4. join in discussions about art work? 

5. understand and follow directions? 

6. manifest orderly habits of work? 

7. manifest orderly habits in caring for materials? 

8. show co-operation in joint products? 

Where special difficulties arise and one or several of the 
manifestations are unsatisfactory, the next step is to discover 
the causes, if possible. Among the types of tests which may 
reveal the causes of difficulty are the following: 

1. Vision 

2. Hearing 

3. Motor co-ordination 

4. Spatial ability

5. Visual memory




Art concepts and vocabulary, and knowledge of materials 
and media, should also be checked in case of continuing
difficulty.

Character and personality. So far as the school’s contribution
is concerned, good character and personality result
from a complete well-balanced program of educational activities,
and not from a mere special type of discipline or program
aimed directly at desirable traits and conduct. Punctuality,
accuracy, persistence, and the like are achieved in relation to
definite activities and responsibilities. Diagnosis, therefore,
utilizes the lists of desirable traits (reduced to specifics as far 
as possible) not as generalities more or less separated from 
school activities, but in connection with the aspects of living 
to which they are related. Diagnosis of personality difficulties 
is, therefore, not co-ordinate with difficulties in reading, music, 
or physical education, but permeates the whole program of 
educational activities. 

QUESTIONS AND EXERCISES 

1. Discuss the relation of specific objectives and diagnosis in arithmetic.

2. What relation, if any, exists between arithmetic activities and mental 
maturity? 

3. When would you use individual oral diagnosis? 

4. Outline the major factors in diagnosing learning difficulties in language;
writing; spelling.

5. Discuss diagnosis of learning difficulties in music; in art. 

6. What factor creates a special difficulty in diagnosis in art? 

7. Indicate the relationship between character and personality diagnosis
and the general school program.

8. Assume that you are a primary supervisor. Outline for your teachers
a complete plan for diagnosis of learning difficulties on the primary
level. Explain why you wish them to utilize this plan.

CHAPTER SEVEN


DIAGNOSIS AND APPRAISAL IN 
ELEMENTARY EDUCATION 


The mastery of reading and writing skills attained by most 
children in the primary grades contributes much to the ease 
and effectiveness of diagnosis in the elementary grades. Many 
diagnostic activities are, however, carried over. Reading 
problems, physical handicaps, and inadequate adjustment, 
among other difficulties, persist even to the university level, 
and devices and equipment described in the previous chapter 
are used repeatedly to discover the causes of difficulties. 

It is well for teachers to remember that the accuracy of 
diagnosis depends not only upon the device used, but also upon 
the user, inasmuch as there are wide individual differences in 
the care and discrimination with which teachers prepare and 
use tests, scales, and check sheets. It is also true that no procedures
now available for diagnosis should be regarded as perfect
or in final form, since changing objectives and increasing
knowledge of diagnostic and appraisal techniques result in the
gradual improvement of diagnostic procedures themselves.
It is nevertheless true also that the teacher who uses the diagnostic
procedures now available can direct learning and adjustment
much better than the teacher who imitates modern
learning activities, but continues to depend upon estimates 
and snap judgments in the diagnosis of learning difficulties. 

General plan for diagnosis. The general plan for effective 
diagnosis and guidance may be indicated as follows: 

1. Study previous test and informal records of each new group to
obtain as much information as possible regarding the ability,
achievement, and personality-adjustment of each member of the
group.

2. Give standardized diagnostic tests of the tools of learning and
achievement tests in the content fields. Note the status of the
class as a whole.

3. Compare the results with intelligence or mental-maturity test
data. When such data are missing or are over two or three years
old, new tests should be given. Note the areas in which the class
as a whole is significantly below expectancy. This will indicate
in a general way the types of objectives and learning activities
which should receive special attention.

4. Make for each individual pupil who is significantly below expectancy
a list of personality adjustment or conduct difficulties
and skill or achievement shortages, and plan conferences with
pupils as well as class and individual activities in the light of this
information. Relatively few pupils in any class will require very
long or exhaustive lists.
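The comparison of achievement with expectancy in steps 3 and 4 can be sketched in a few lines of Python. This is only an illustration: the record layout, the names `Pupil`, `shortages`, and `class_summary`, and the one-grade margin used for "significantly below expectancy" are all assumptions, since the text fixes no numerical margin.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Pupil:
    name: str
    expected: float   # grade placement expected from mental-maturity data (assumed field)
    achieved: dict    # area -> grade placement earned on an achievement test


def shortages(pupil, margin=1.0):
    """Areas in which the pupil is at least `margin` grades below expectancy."""
    return sorted(
        area for area, grade in pupil.achieved.items()
        if pupil.expected - grade >= margin
    )


def class_summary(pupils, margin=1.0):
    """Number of pupils below expectancy in each area (the class as a whole)."""
    return Counter(area for p in pupils for area in shortages(p, margin))


# Invented sample records, for illustration only.
pupils = [
    Pupil("A", 5.0, {"reading": 3.8, "arithmetic": 5.1, "spelling": 4.6}),
    Pupil("B", 4.0, {"reading": 2.5, "arithmetic": 2.9, "spelling": 4.2}),
]
for p in pupils:
    print(p.name, shortages(p))
print(class_summary(pupils).most_common())
```

The per-pupil lists correspond to step 4 and the class counts to step 3. A teacher would choose the margin according to the reliability of the tests used.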


Such a plan indicates what is meant by adjusting learning 
activities to the needs and capacities of children who should 
be guided, inspired, and directed, as far as possible, in terms 
of their interests. Actual experience in many practical situa¬ 
tions indicates that these methods are more efficient and ef¬ 
fective than the practice of drifting along on successive waves 
of enthusiasm for a variety of procedures which recognize no 
definite obligations toward any child, and provide no definite 
means for determining his capacities or needs. The happiness 
of children, on the basis of which such programs are too often 
defended, need not be attained at the expense of desirable 
growth and development. The out-of-school world into which 
children are gradually integrated requires that they also strug¬ 
gle and persist in school activities if they are later to become 
useful citizens. Happiness is an inevitable concomitant and
active agent in a school program which challenges but does not 
defeat its students. 

Personality and adjustment diagnosis. Traditional teacher- 
training methods as well as limitations of time and equipment 
are important factors contributing to the present difficulties of
adjusting the whole school program to the varying needs of all 
children. As a result, many types of undesirable tensions are 
produced and certain maladjustments are actually encouraged. 
Even though teachers are fast learning not to attempt the 
direct destruction of symptoms of inadequate pupil adjustment
or undesirable behavior through mere disciplinary measures,
the causes of many difficulties are complex and difficult to 
determine. There is a growing realization, however, that the 
hope of progress depends upon the use and improvement of 
such techniques and devices as are now available, even though 
they may be far from perfect. 

Among the evidences of maladjustment characteristic of the 
typical classroom are the following: ¹

1. Disinterest in school work

2. Wasting time

3. Disturbing other pupils 

4. Silliness 

5. Unnecessary or excessive tardiness 

6. Speech difficulties 

7. Truancy 

8. Lying

9. Bullying or quarrelsomeness 

10. Impertinence 

11. Disobedience

12. Overactivity 

13. Constant complaining 

14. Temper outbursts 

15. Masturbation 

16. Stealing 

17. Writing obscene notes 

18. Cheating 

Each of the above evidences of maladjustment may be due
primarily to one or to several causes. If these causative
factors can be identified and removed, the probabilities of
normal adjustment are greatly increased. Cheating, for example,
was studied by the Character Education Committee of
the Los Angeles Elementary Principals' Club, whose Behavior
Check List is presented in Figure 31.

¹ Haggerty, M. E. "The Incidence of Undesirable Behavior in Public School Children";
in Journal of Educational Research, vol. 12, no. 2.

Each of the types of maladjustment presented above and 
others as well are due to a simple or complex combination of 
causes. Check lists of reasons should be developed through 
systematic observation and other types of research, and pri¬ 
mary effort directed at the removal of such causes. The re¬ 
moval of causes often involves changes in the pupil himself 
and changes in the environment outside of school, but for 
the most part is concerned with adjustments in the school 
program. 

A test like the Raubenheimer Test of Potential Delinquency ¹
may be used to advantage with new groups. This test consists
of seven parts: (1) overstatement on books read; (2) overstatement
on knowledge claimed; (3) reading preferences; (4)
social preferences; (5) social attitudes; (6) activity preferences; 
and (7) offense rating. In the knowledge overstatement test 
appear such questions as the following: 

1. Do you know who discovered America? 

2. Do you know who wrote Huckleberry Finn? 

The pupil scores his own test and adds up his score. A second 
part of the test which he takes immediately afterward contains 
items of this type: 

1. America was discovered by Drake, Columbus, Balboa, Cook. 

2. Huckleberry Finn was written by Alger, Dickens, Henty, Mark 
Twain. 

While some will make honest errors, pupils who are prone to 
overstatement, or show other types of undesirable deviation, 
may be identified. 
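The logic of the knowledge-overstatement check can be sketched as follows. The text does not give the actual scoring rules of the Raubenheimer test, so the function name `overstatement_score` and the rule used here (one point for each item claimed as known but then answered incorrectly) are assumptions made for illustration.

```python
def overstatement_score(claims, answers, key):
    """Count items the pupil claimed to know but then answered incorrectly.

    claims:  item -> True if the pupil said "yes, I know this"
    answers: item -> the option the pupil chose in the second part
    key:     item -> the correct option
    """
    return sum(
        1
        for item, claimed in claims.items()
        if claimed and answers.get(item) != key[item]
    )


# The two sample items quoted in the text; the pupil's responses are invented.
key = {"discovered America": "Columbus", "wrote Huckleberry Finn": "Mark Twain"}
claims = {"discovered America": True, "wrote Huckleberry Finn": True}
answers = {"discovered America": "Columbus", "wrote Huckleberry Finn": "Dickens"}

print(overstatement_score(claims, answers, key))
```

A single mismatch may be an honest error, as the text notes, so a score of this kind is suggestive only when it is high relative to that of other pupils.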

¹ Raubenheimer, A. S. An Experimental Study of Behavior Traits of the Potentially
Delinquent Boy. Psychological Monographs, no. 159, 1925.

Los Angeles Elementary Principals' Club.

The general impression that only a very few pupils show evidences
of undesirable behavior was shown to be erroneous in a
study of one typical school. Briefly, the significant findings
were:

1. Fifty-one per cent, a much larger proportion of children than is
generally supposed, exhibited some forms of undesirable behavior.

2. Undesirable behavior was not restricted to any particular group,
but occurred from ages 6 to 16.

3. The common belief that boys manifest a larger amount of unde¬ 
sirable behavior than do girls was confirmed. 

4. There is a decided relationship between intelligence and undesir¬ 
able behavior. There is least undesirable behavior when intelli¬ 
gence is average; the curve of undesirable behavior rises gradually 
for intelligence above average, but rises sharply for intelligence 
below average. 

The California Test of Personality ² is designed "to identify
certain highly important factors in personality-adjustment 
usually designated as intangibles." Its major purpose is to 
reveal how successfully or unsuccessfully the pupil is adjusting
to the problems which confront him. Each profile is
divided into two parts. One part reveals what he feels and 
thinks about himself; the other, how he functions as a social 
being. For the most part the items of the tests are not 
included to obtain reactions to situations which are right or 
wrong but to discover how the pupil feels about them. 

A diagnostic profile reveals graphically the extent, if any, to 
which each pupil deviates from the manner in which pupils in 
general adjust to the same circumstances. The following are 
three items from the elementary battery: 

1. Are your tests so hard or unfair that it is right to cheat? 

2. Do your classmates quarrel with you? 

3. Do you suffer more than most people when you are ill? 

¹ Haggerty, M. E. "The Incidence of Undesirable Behavior in Public School Children";
in Journal of Educational Research, vol. 12, no. 2.

² Thorpe, L. P., Willis W. Clark, and E. W. Tiegs. The California Test of Personality.
Los Angeles, Calif., California Test Bureau, 1939.
When a profile reveals adjustment difficulties the teacher
turns to the appropriate section of the manual for explanations 
of causes and for suggestions on effective guidance. Tendencies
to “paint self-portraits better than the originals” are
nullified by disguising the intent of items as far as possible and 
by using checking devices. 

The Haggerty-Olson-Wickman ¹ Behavior Rating Schedules
provide another type of opportunity for diagnosing the nature 
and extent of undesirable behavior as the basis of remedial 
activity. The excerpt on page 154 illustrates the types and 
forms of items which appear in Division IV of the schedule. 

Naturally, the more serious the deviation from normal the 
greater the effort that should be expended in identifying and 
in removing, if possible, the causes. Among them are physical 
defects, incorrect grade or subject placement, wrong or inade¬ 
quate social experiences, personality clashes with parents or 
teachers, mental immaturity, poor work habits, and, occa¬ 
sionally, significant nervous or emotional difficulties. 

Physical diagnosis. The old attitude that the health of the 
pupil or his behavior outside of school is no business of the 
teacher still persists in a limited number of communities. 
Among teachers themselves who have long been in the service, 
something of the same attitude often exists. But the modern 
teacher knows that physical factors are often significant in 
learning. While she cannot compel the parent or school to 
remedy physical defects, she can aid in detecting them. 

Woods's outline of significant factors related to health and 
physical fitness has been presented in Chapter Five. In general, 
the same evidences should be sought in the elementary grades. 
Too frequent toothache, chronic catarrh, or malnutrition may 
make the difference between comfort and adjustment, and maladjustment.
In general, the teacher should be on the alert for
the evidences of physical difficulty presented in Table 16.
Care must be observed not to offend sensitive children or parents,
or children from homes where little medical care is possible.

¹ Haggerty, M. E., Olson, W. C., and Wickman, E. K. Behavior Rating Schedules.
Copyright, 1930, by World Book Company, Yonkers-on-Hudson, New York. Reproduced
by written permission.

Some of these defects can be readily recognized and their 
seriousness determined by the use of equipment in the schools. 
For the most part, however, when these symptoms are present 
the pupil should be referred to a nurse or physician.

Modifications should be made in the pupil's responsibilities 
and activities to compensate for physical difficulties. Special 
checkups on lighting are advisable for pupils with defective 
vision. The disadvantages which pupils with slight astigma¬ 
tisms suffer at low intensities largely disappear at ten foot- 
candles of light. Cross lights of unequal intensities and direct 
rays of the sun, or reflections from blackboards should be 
avoided. The eyes adjust more slowly to differing intensities of 
light than the rate at which light fades from the sky in late win¬ 
ter afternoon, hence artificial lights should be used to aid the 
transition. It is inadvisable to cover the upper sashes of 
windows. Doubling the height of windows trebles the illu¬ 
mination in the middle and rear of the room. Dirty windows 
and figured frosted glass may reduce the illumination to 40 or 
50 per cent of normal. 

All children should be as physically fit as possible. Although 
some possess definite hereditary limitations, physical growth 
and development are aided by positive physical and health 
education programs, objectives of which include, among other 
things, knowledges, understandings, attitudes, and ideals, and 
actual skill in performance of physical activities. 

The Athletic Badge tests,¹ for example, consist of separate
sets of measurements of athletic ability for boys and girls.
Boys' stunts include chinning, rope climbing, dashes, jumps,
baseball throw, and shot put; girls’ stunts include balancing,
races, volley ball, tennis serving, and basket-ball throwing.
These tests show up lack of physical development and various
disabilities, and results may be used to direct remedial activities.
Lack of knowledge about health may be diagnosed with
the aid of the Gates-Strang Health Knowledge Test.² Postural
defects may be diagnosed with the aid of posture scales or silhouettographs.
Many other tests have been developed to
measure more or less general objectives like strength, physical
agility, poise, and general health. While some possess relatively
low reliability for appraisal purposes, specific items of
the tests are frequently useful for diagnosis.

¹ Athletic Badge Tests for Boys and Girls, Elementary and High School. Playground and
Recreation Association of America. Washington, D.C., Department of Interior, Bureau
of Education, 1933.


Table 16. Defects and Their Common Evidences ³

Defects                     Common Evidences

1. Vision                   Poor reading, reading too close to book, headaches,
                            inflamed eyes and lids, lack of interest in reading,
                            dizziness, etc.

2. Hearing                  Blank expression, expressionless voice, lack of
                            attention, errors in following directions, listening
                            posture, etc.

3. Motor co-ordination      Poor writing, lack of neatness, awkwardness, poor
                            at games, etc.

4. Posture                  Round shoulders, head tilted forward, repressed
                            chest, protruding abdomen, etc.

5. Feet                     Locking knees, standing on outer edge of feet, pain,
                            toeing out or in too much, leaning against wall,
                            shifting from foot to foot, flat feet, swelling, etc.

6. Nose and throat          Mouth-breathing, cough, deafness, offensive breath,
                            headache, red or running nose, nasal voice, etc.

7. Teeth                    Speech difficulties, bad breath, toothache,
                            prominent or broken teeth, etc.

8. Nutrition                Pallor, lack of weight or excessive fat, avoiding
                            play, short of breath, puffiness of face or eyes, etc.

9. Nerves                   Temper tantrums, restlessness, irritability, truancy,
                            tics, twitching of muscles, nail-biting, etc.

10. Heart                   Shortness of breath, easily fatigued, pallor, blue
                            lips, cold perspiration, irritability, etc.

11. Tuberculosis            Loss of weight, afternoon fevers, lack of appetite,
                            fatigue, pallor, shortness of breath, etc.

12. Communicable skin       Individual isolated pus sores; impetigo, dry scaly
    diseases                slightly red circular patches; ringworm, irritating
                            rash between fingers and flexor surfaces of wrist as
                            well as other parts of the body; itching scalp, head
                            lice, white nits on hair, etc.

13. Other communicable      Extreme flush or pallor, chills or fever, rashes or
    diseases                spots, nausea or vomiting, stiff neck, extreme
                            dizziness, pains in neck or chest, etc.

³ Adapted from Outline of Procedure for Guidance in Elementary Schools, Los Angeles,
Calif., Board of Education, Educational Research and Guidance Section. (Sept. 1937.)

Mental-maturity diagnosis. Major considerations regard¬ 
ing diagnosis of mental maturity have already been presented 
in Chapter Two. However, errors in testing and record-keeping 
as well as changes in the environment of pupils and in pupils 
themselves make it desirable to give tests of mental maturity 
at least every two years. Whereas it is desirable to give at 
least one individual test to each pupil entering the B1 grade, on
the elementary level, pupils are mature enough to take group 
tests. Nevertheless, where group-test results or achievement 
differ markedly from expectancy, or where unusual learning 
problems develop, it is always well to check with individual 
tests. All pupils entering from other school systems for whom 
no adequate data are available also should be tested. 

Class averages of mental-maturity tests furnish reliable pre¬ 
dictions of class achievement; individual test scores provide 
less reliable predictions. At this point, it is well to review the 
data on this subject in Chapter Two.

² Gates, A. I., and Strang, R. Gates-Strang Health Knowledge Test, Form I, Grades 3-12.
New York, Bureau of Publications, Teachers College, Columbia University, 1925.

It is desirable to obtain for all pupils experiencing significant
learning difficulties a measure and analysis of the separate
factors of mental maturity. This may be done through an
individual test, or through a group test ¹ that breaks down the
complex of abilities now generally expressed in a single M.A. or
I.Q. For example, while there is overlapping in the factors
of memory, spatial ability, verbal ability, mathematical reasoning,
and logical reasoning of pupils obtaining the same
I.Q., one may be strong in memory or verbal ability and another
weak. Obviously, the teacher planning remedial work
should proceed differently with these two pupils. Also, two
pupils having learning difficulty may have significantly different
I.Q.’s, but be equally mature in development of memory
or verbal abilities. Remedial activities fitted to their needs
would, therefore, be differentiated on factors other than these.

It is very important in administering group tests of intelli¬ 
gence or mental maturity to ascertain that pupils can see and 
hear well enough to do themselves justice, and that they are 
free from muscular or emotional control difficulties which 
would block their efforts. 

The tools of learning. Pupils making normal social adjust¬ 
ments, possessing no significant physical defects, and enjoying 
normal mental maturity, still experience educational diffi¬ 
culties. In fact, most of the teacher's diagnostic problems are 
of this type. Formerly, teachers taught “harder,” put in 
extra time, scolded, cajoled, and did almost everything except 
to get at the causes of learning difficulties. Whereas the con¬ 
tent of courses, subjects, units, or topics may vary consider¬ 
ably, the skills necessary to deal with them adequately are rela¬ 
tively stable. Formerly reading, writing, and arithmetic were 
taught primarily as ends in themselves and at one time con¬ 
stituted the major objectives of education. They are fre¬ 
quently referred to as the skill and drill subjects and are often 
confused with the skills or tools of learning, which comprise certain
essential elements of the old skill subjects together with others which
are necessary in further learning. Thus, reading skills today
mean something much more complex and useful than the traditional
concept; arithmetic is no longer ability to solve unique
and useless problems or to "finish the book" first, but represents
skill in dealing with quantitative ideas; language is not
mere ability to imitate good form, but ability to utilize ideas
and concepts in thinking individually and collectively.

¹ At the present time the California Tests of Mental Maturity are the only group tests of
mental maturity that deliberately attempt to break down the M.A. and I.Q. into factors
which are more closely related to teaching and learning.

Whereas tests of mental maturity theoretically reveal capac¬ 
ity to achieve, tests of the tools of learning show to what extent 
the skills necessary in achievement have already been attained. 
Tests of these learning skills predict success in other types of 
achievement (social studies, science, and so forth) more accu¬ 
rately than do tests of intelligence. Whereas the relationship 
between individual I.Q. and general academic success averages
thirteen per cent better than chance, academic success can 
often be predicted from skill tests to the extent represented by 
a coefficient of correlation of .85, or almost fifty per cent 
better than chance. 
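The passage does not say how "per cent better than chance" was computed. One reading that reproduces both figures, assumed here and not stated in the text, is the index of forecasting efficiency, E = 1 − √(1 − r²), which expresses how far a correlation r reduces the error of prediction below that of a guess. The correlation of .50 used for the I.Q. case is likewise an assumption, chosen because it yields roughly thirteen per cent.

```python
import math


def forecasting_efficiency(r):
    """Index of forecasting efficiency: proportional reduction in the
    standard error of estimate obtained from a correlation of r."""
    return 1.0 - math.sqrt(1.0 - r * r)


for label, r in [("skill tests (r = .85)", 0.85), ("I.Q., assumed r = .50", 0.50)]:
    print(f"{label}: {100 * forecasting_efficiency(r):.0f} per cent better than chance")
```

Under this reading a correlation of .85 gives about 47 per cent ("almost fifty"), and a correlation of .50 gives about 13 per cent. Even a fairly high correlation therefore leaves much of the error of prediction in place.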

It is important, therefore, that data on the status of the tools 
or skills of learning be available to the teacher at the beginning 
of each semester or term. From them, together with other 
evidences, the teacher may detect pupil weaknesses and plan 
special assistance in tool mastery. If these tools are neglected 
or unattained during the primary years, the resulting difficulties 
will accumulate during the period of elementary education. 
Many high-school and college students fall far short of their 
possibilities or fail entirely because they never mastered these 
tools. Throughout the school program, directly and indi¬ 
rectly, these skills need to increase to deal with the increasing 
complexity of objectives on higher levels. They will receive 
attention under their separate categories of reading, arithmetic, 
language, writing, and spelling. 

Reading diagnosis. As the pupil matures and advances he 
passes from the mere mastery of reading skills, as such, to their 
utilization in attaining other objectives. Among the many
useful skills which should be mastered and utilized are the
following. Their absence results in difficulties and must be
detected through diagnosis.

1. Skill in adding new words and ideas

2. Skill in interpreting new constructions, idioms, etc.

3. Skill in detecting the main idea of a paragraph or selection

4. Skill in finding the answers to questions

5. Skill in skimming to determine the utility of materials

6. Skill in finding details related to a given problem or question

7. Skill in remembering the main ideas of sentences and paragraphs
necessary for gaining new meanings as reading proceeds

8. Skill in verifying opinions, assertions, or beliefs

9. Skill in making inferences and drawing conclusions as reading
proceeds

10. Skill in predicting the probable outcomes of a given set of
circumstances

Diagnostic tests which could detect these and other shortages 
would be so long that teachers would not use them, hence most
diagnostic reading tests are too narrow in function, sampling 
only a limited number of these desirable skills. The above 
objectives also involve the use of tables of contents, indexes, 
maps, globes, encyclopedias, and the like, and pupils should 
be tested for these among other skills when they are failing to
make normal progress.

The determination of whether or not progress is normal 
requires the use of standardized tests, since only by comparison 
of pupil achievement with a standard can a relatively valid 
conclusion be reached. Some standardized tests are survey or 
inventory tests, useful primarily for obtaining grade place¬ 
ments and for administrative and supervisory purposes; others 
are diagnostic as well, providing in addition to grade-placement 
data definite specific evidences of learning difficulties which 
furnish immediate guidance for the teacher. The reading profile
of the Iowa Every-Pupil Tests of Basic Skills,¹ for example,
presented in Figure 32, reveals status in ten different reading
skills for the end of the seventh grade.

¹ Horn, Ernest, and McBroom, Maude. Iowa Every-Pupil Tests of Basic Skills. Iowa
City, Ia., Bureau of Educational Research and Service.

FIG. 32. READING PROFILE

Since these profiles are provided in terms of grades and se¬ 
mesters, they may be still more adequately interpreted within 
these limits by the percentile scale on the left. 

The Progressive Achievement Tests ¹ provide a different
type of profile which also reveals graphically the actual attain¬ 
ment of skills contrasted with the theoretically desirable mas¬ 
tery at succeeding grade levels. Combined with the profile is a 
diagnostic analysis of learning difficulties which isolates and 
identifies the objectives for which teachers should plan re¬ 
medial activities. 

This analysis, presented as Figure 33, is completed only for 
pupils showing marked difficulties, and objectives are crossed 
off as pupils master them. Where reading difficulties persist 
after regular remedial activities have been used certain tests 
described in connection with primary reading should be re¬ 
peated. Oral reading should be a regular, though decreasing, 
part of the work in the elementary grades, both for its speech 
value and for its value as a diagnostic device. Tests like the 
Gray Oral Reading Check Tests previously described ² facilitate
diagnosis. Mental age, hearing, vision, and eye-movement
tests reveal difficulties here as on the primary level. 

Familiarity with good standardized diagnostic tests will 
suggest to teachers many types of items which they may use 
for continuous informal diagnosis. 

After some facility in reading has been gained and it has 
become a tool for learning, measures of rate also reveal impor¬ 
tant information. Data reported by the American Optical 
Company, presented in Table 17, show that training does 
increase rate of reading. While these norms constitute averages
for whole grades rather than for shorter time periods, they
do show general trends. Differences in materials, motivation,
and other factors should be taken into account in using them.
Care must also be observed in making comprehension tests of
materials read, since rate is not significant unless materials are
understood.

¹ Tiegs, E. W., and Clark, W. W. Progressive Achievement Tests. Los Angeles, Calif.,
California Test Bureau, 1934.

² See Chapter Five.

FIG. 33. DIAGNOSTIC ANALYSIS IN READING. PROGRESSIVE ACHIEVEMENT TESTS

Table 17. American Optical Company Eye-Movement Norms

(Data from Ophthalm-O-Graph readings before and after Metron-O-Scopic
training. Based on silent reading of standard material for the various
grade levels.)

                                                 First  Second   Third  Fourth   Fifth   Sixth Seventh  High School

Average reading rate, words per minute              55       ?     115     168     190     200     210          295
  Averages after Metron-O-Scopic training           90       ?     160     195     225     250     280          350

Average number of fixations per 100 words          250     200     175     140     125     120     115           93
  Averages after Metron-O-Scopic training          190     150     137     115     100      95      95           88

Average number of regressions per 100 words         60      50      40      30      25      23      21           17
  Averages after Metron-O-Scopic training           40      33       ?      25      20      16      15           13

To use the table: Compute data of film record and compare the number of fixations and
regressions and the speed with the appropriate grade level.

For instance, using the first line of figures in each item: if the subject's reading rate is
200 words per minute he is at sixth-grade level in terms of speed. If he makes 110 fixations
in reading 100 words he is better than seventh grade in terms of fixations; and if he makes
17 regressions in reading 100 words he is average for the high-school level.

The second line of figures in each item provides comparable data taken from the reading
graphs of subjects who had received Metron-O-Scopic training.
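The look-up described in the note to the table can be sketched as a small routine. The norm values below are transcribed from the first-line figures of Table 17 and should be checked against the table before use; the second-grade reading rate is left out. The names `GRADES`, `NORMS`, and `grade_level` are illustrative only.

```python
GRADES = ["First", "Second", "Third", "Fourth", "Fifth", "Sixth", "Seventh", "High School"]

# First-line (untrained) figures; None marks a value omitted here.
NORMS = {
    "rate": [55, None, 115, 168, 190, 200, 210, 295],        # words per minute
    "fixations": [250, 200, 175, 140, 125, 120, 115, 93],    # per 100 words
    "regressions": [60, 50, 40, 30, 25, 23, 21, 17],         # per 100 words
}


def grade_level(measure, value):
    """Highest grade whose norm the subject equals or betters, or None.

    A higher rate is better; fewer fixations or regressions are better.
    """
    higher_is_better = measure == "rate"
    level = None
    for grade, norm in zip(GRADES, NORMS[measure]):
        if norm is None:
            continue
        meets = value >= norm if higher_is_better else value <= norm
        if meets:
            level = grade
    return level


print(grade_level("rate", 200))
print(grade_level("fixations", 110))
print(grade_level("regressions", 17))
```

The three calls reproduce the worked example in the note: sixth grade for speed, seventh grade met (and bettered) for fixations, and high-school level for regressions.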

The eighteen-month Goodsell study ¹ in which some two
hundred Los Angeles Polytechnic High School tenth-graders
with reading ability at the eighth-grade level or below were
placed in special reading improvement groups instead of the
regular English classes appears to lend some support to the data
of Table 17. Students were divided into two groups, one receiving
Metron-O-Scopic training only, and the other engaging
in a number of other remedial activities. Those receiving
Metron-O-Scopic training gained an average of 9.7 months in
reading ability, while the average gain for the other group was
5.9 months as measured by the Progressive Reading Tests,
advanced form.

¹ Goodsell, J. G. "Some Reading Improvement Results in a Senior High School"; in
Los Angeles School Journal, vol. 21, no. 33. (May 23, 1938.)

The study also revealed the fact that while Metron-O-Scopic 
training apparently produced significant average gain for the 
group as a whole, many students showed no change in grade 
placement, and several actually earned lower grade placements 
after the training. Among those who earned higher scores on 
the second reading test were many whose eye-movement pat¬ 
terns, as revealed by a second series of eye-movement photo¬ 
graphs, had undergone no measurable change. 

In a briefer unpublished study, Goodsell found a definite 
reduction in the number of fixations and regressions, as well as 
an increase in both reading rate and comprehension. 

These studies emphasize the complexity of the reading 
problem and the danger of misinterpreting the significance of 
gains in reading rate. 


QUESTIONS AND EXERCISES 

1. What is adjustment diagnosis? 

2. Describe several techniques used in adjustment diagnosis. 

3. Evaluate the suggested procedures for physical diagnosis, indicating 
how you would modify them. 

4. Why should mental maturity be determined more than once? 

5. Are there similar reasons for repeating adjustment and physical 
diagnoses? 

6. Indicate the importance of diagnosis of the tools of learning and 
explain how they differ from the old concept of the three R's. 

7. Outline procedures used in the diagnosis of reading difficulties. 

8. If possible, diagnose a case of reading disability in a regular class¬ 
room, and report your activity and findings to the class. 
CHAPTER EIGHT


DIAGNOSIS AND APPRAISAL IN 
ELEMENTARY EDUCATION 
(Concluded)


In this chapter as in Chapter Five, problems in diagnosis 
and appraisal will be treated in connection with different subject 
fields. This does not mean, however, a subject-matter concept 
of objectives; it means merely that the subject categories 
familiar to and generally used by teachers serve as the organizing
principle. Personality adjustment, physical, mental maturity,
learning skills, and reading diagnosis and appraisal were
treated in Chapter Seven. Arithmetic, language, writing, 
spelling, the social studies, music, the fine arts, the industrial 
and household arts, and elementary science will receive atten¬ 
tion in Chapter Eight. 

Diagnosis in arithmetic. The major objective in elementary 
arithmetic is to obtain a growing control over environment by 
increasing the ability to do quantitative thinking. This major 
objective is attained through the mastery of many related ob¬ 
jectives of the following types, while dealing with real problems 
in the everyday environment and more general problems 
within the understanding of pupils: 

1. Number ideas, meanings, and symbols, and their uses 

2. Processes and computations 

3. Special uses in business, industry, and the improvement in think¬ 
ing generally 

Such more specific objectives as the mastery of the com¬ 
binations, fundamental operations with whole numbers and 
fractions, and operations with decimals are revealed as necessary
skills and are learned in connection with their uses as well
as through direct drill when necessary. Diagnostic procedures
must take account of the specific elements of these various goals.

No unusual testing problems are involved in the diagnosis of 
difficulties in combinations and in many other processes; but 
many pupils still fail to master them simply because teachers 
fail to do the necessary diagnostic testing to detect difficulties, 
either as processes or in their uses in the solution of problems. 
As combinations and other processes are learned there should 
be frequent mimeographed tests containing processes learned 
during the week and review computations which have given 
trouble. Diagnosis is not complete until each pupil has made 
a list of his difficulties for study, and the teacher has a list 
showing the number of times each combination has been 
missed. The extent of individualization of remedial work is 
determined by these two types of lists. 
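The two lists described above, one kept by each pupil and one kept by the teacher, can be derived from the same weekly test records. The sketch below is illustrative only: the record layout and the name `diagnose` are assumptions, and the sample results are invented.

```python
from collections import Counter


def diagnose(test_results):
    """Build both lists from weekly test records.

    test_results: pupil -> list of combinations missed (repeats allowed
    when a combination is missed on more than one test).
    Returns (pupil_lists, class_counts).
    """
    # Each pupil's own study list: the distinct combinations he missed.
    pupil_lists = {pupil: sorted(set(missed)) for pupil, missed in test_results.items()}
    # The teacher's list: number of times each combination was missed.
    class_counts = Counter(c for missed in test_results.values() for c in missed)
    return pupil_lists, class_counts


results = {
    "Ann": ["7+8", "9+6", "7+8"],
    "Ben": ["7+8"],
    "Cal": [],
}
pupil_lists, class_counts = diagnose(results)
print(pupil_lists)
print(class_counts.most_common())
```

A combination that heads the class count calls for group reteaching, while one that appears on only a single pupil's list calls for individual work.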

After the period of simple combinations is past, more com¬ 
plicated processes of the same type follow. The number of 
types of possible difficulties is almost legion. A diagnostic 
test will often locate the point of difficulty, but does not reveal 
its cause. Table 18, for example, presents a list of thirty-three
difficulties in addition.¹

Such lists are helpful as a point of departure when making an 
individual oral diagnosis for process difficulties which have 
not yielded to ordinary remedial procedures. Similar lists 
should be developed for each process. 

At the beginning of each term, each teacher should have data 
showing the extent to which each pupil (and the class as a 
whole) has mastered previously studied objectives. Figure 34 
illustrates the derivation of grade placement and status in 
several specific elements of arithmetic reasoning and funda¬ 
mentals. 

* Buswell, G. T., and John, Lenore. Diagnostic Studies in Arithmetic. Chicago,
University of Chicago; Supplementary Educational Monograph, no. 30.

Table 18

                                                           Grade
Source of Difficulty                                 III    IV     V    VI  Total

 1. Errors in combinations                            81   103    78    58    320
 2. Counting                                          61    83    54    17    215
 3. Added carried number last                         39    45    45    26    155
 4. Forgot to add carried number                      37    38    34    37    126
 5. Retraced work after partly done                   26    34    39    22    121
 6. Added carried number irregularly                  26    30    28    18    102
 7. Wrote number to be carried                        34    25    18    12     89
 8. Carried wrong number                              28    19    26    14     87
 9. Irregular procedure in column                     16    29    23    18     86
10. Grouped two or more numbers                       25    22    21    16     84
11. Split numbers                                     12    29    25    14     80
12. Used wrong fundamental operation                  23    25    20    11     79
13. Lost place in column                              17    17    17    14     65
14. Depended upon visualization                       24     8    27     2     61
15. Disregarded column position                       34    11     9     1     55
16. Omitted one or more digits                        13    21    13     5     52
17. Errors in reading numbers                         14    10    21   [?]     52
18. Dropped back one or more tens                     13    12    17   [?]     47
19. Derived unknown combination from familiar one     13     7    11    11     42
20. Disregarded one column                            15    11     8     2     36
21. Error in writing answer                           12     3    14     5     34
22. Skipped one or more decades                       11     7   [?]     5     32
23. Carried when there was nothing to carry            6     9   [?]     5     29
24. Used scrap paper                                   7     5   [?]     0     21
25. Added in pairs, giving last sum as answer          6     6   [?]     2     20
26. Added same digit in two columns                   10     6   [?]   [?]     18
27. Wrote carried number in answer                    10     2     2   [?]     15
28. Added same number twice                            4     1     3   [?]     11
29. Began with left column                             1     1     2   [?]      3
30. Confused columns                                   1     0     0     0      1
31. Added carried number twice                         0     1     0     0    [?]
32. Subtracted carried number                          0     0     0     1    [?]
33. Added imaginary column                             0     0     1   [?]    [?]

    Totals                                            96   124   116    78    [?]

The teacher simply locates scores on the lines to the right of each,
and reads the grade placement on the scale above or below. When
pupils score significantly below the heavy vertical line
representing actual grade placement, data are transferred to the
diagnostic analysis and an individual remedial program planned on
this basis.

FIG. 34. ILLUSTRATION OF DERIVATION OF GRADE PLACEMENT FROM PUPIL SCORES

By combining individual test results teachers may also discover
objectives generally unattained by the class as a whole, and plan
appropriate class learning activities.

The need of process skills in quantitative thinking begins long 
before many of these processes have been mastered. This need 
as well as the use of skills mastered in the solution of quanti¬ 
tative problems motivates a better mastery. However, other 
types of difficulties develop. One of the major sources of error 
is the lack of reading ability, pupils attempting the solution of 
problems without understanding either the answer to be 
sought or the conditions upon which a solution depends. 

Figure 35 is an excerpt from the Compass Diagnostic Tests 
in Arithmetic. The test requires the pupil to go through cer¬ 
tain well-defined steps in orderly fashion. Besides serving as 
a test of quantitative thinking it furnishes a routine which 
should be imitated in learning to think in this field. 

All diagnosis should be in terms of objectives. Computa¬ 
tional objectives are well defined, hence diagnosis is relatively 
simple. Although there is the difficulty in teaching arith¬ 
metic of providing real learning situations rather than tradi¬ 
tional described problems of the textbook, difficulties in 
problem solving can be diagnosed by the use of such tests as 
have just been presented. 

Diagnosis in language. The major problem in diagnosing 
learning difficulties in language is the vagueness and uncer¬ 
tainty of objectives. While standardized tests reveal a gen¬ 
eral improvement from grade to grade, many common errors 
of usage which theoretically should have been eradicated 
through proper remedial work in the early elementary grades 
persist through high school and college. Table 19 presents a 
study * of the prevalence of error among eighth-grade pupils.
The error quotient is the percentage of errors actually made in 
relation to the opportunity for making errors. 
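
The arithmetic of the error quotient may be illustrated by a short Python sketch. It assumes only the definition just given (errors made divided by opportunities for error); the function name `error_quotient` and the counts used are hypothetical. Table 19 prints its quotients as decimal fractions, so that .52 stands for 52 per cent, and the sketch follows the same convention.

```python
def error_quotient(errors, opportunities):
    """Errors actually made divided by opportunities for error (a proportion)."""
    if opportunities <= 0:
        raise ValueError("opportunities must be positive")
    return errors / opportunities


# Hypothetical counts: 26 errors where 50 errors were possible.
quotient = error_quotient(26, 50)
print(f"{quotient:.2f}")
```

A quotient computed in this way allows usages with few opportunities for error to be compared fairly with usages that occur on every page.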

* From Walter S. Guiler, “Improving Instruction in English Mechanics in the Elementary
School”; in Elementary School Journal. (February, 1934.)



Problems

Problem 7

We drove our auto 4 miles to the railway station and there took a
train on which we rode 3 hours at 40 miles per hour. How far did we
ride altogether that day?

Part 1: Comprehension (Read the problem)

(Check true statement)

.... We rode on the train all day.
.... We rode only to the station.
.... We rode 40 miles per hour in the auto.
.... We rode only on the train.
.... We rode much farther on the train than in the auto.

Part 2: What is Given

(Check what is given)

.... Distance from home to station.
.... Distance we rode on train.
.... Speed of train.
.... Time we rode on the train.
.... Speed of auto.

Part 3: What is Called For

(Check what is called for)

.... Distance we rode on train alone.
.... Distance from home to station.
.... Speed of train.
.... Total distance we rode.
.... Speed of auto.

Part 4: Probable Answer

(Check probable answer)

.... 132 miles.
.... About 124 hours.
.... About 280 miles.
.... 148 hours.
.... About 123 miles.

Part 5: Correct Solution

(Check correct solution)

.... 40 + 4 = 44;  44 × 3 = 132
.... 40 × 3 = 120;  120 + 4 = 124
.... 40 × 3 = 120;  4 × 4 = 16;  120 + 16 = 136
.... 40 × 3 = 120;  144 + 4 = 148
.... 4 + 3 = 7;  40 × 7 = 280

FIG. 35. ILLUSTRATION OF DIAGNOSIS OF PROBLEM-SOLVING DIFFICULTY

From Ruch, Knight, Greene, and Studebaker. Compass Diagnostic Tests in Arithmetic.
Copyright, 1925. Chicago, Scott, Foresman and Company. Used with permission of the
publishers.

Table 19. Prevalence of Error among Eighth-Grade Pupils
in Certain Phases of English Usage *

Usage in Which Error Was Made                             Error Quotient

Use of capital letters:
 1. Capital letters for titles used with persons’ names              .52
 2. Capital letters for adjectives derived from proper names         .62
 3. Capital letters for all important words in a title               .66
 4. Small letters for the names of the seasons except when they are
    personified                                                      .68
 5. Capital letters for names of particular sections of country      .77

Use of punctuation marks:
 6. Comma to separate the parts of a calendar date                   .51
 7. Comma to set off the name of a city from the name of a state     .67
 8. Quotation marks to inclose a direct quotation                    .67
 9. Apostrophe to denote possession                                  .69
10. Comma or commas to set off name of person addressed              .75
11. Commas to set off appositives and parenthetical expressions      .77
12. No quotation marks for an indirect quotation                     .81

Use of verbs:
13. Agreement of verb and subject                                    .53
14. Use of past participle with “has,” “have,” or “had”              .61

Use of pronouns:
15. Predicate nominative in the nominative case                      .40
16. Correct case forms for the subject and the object of a verb      .53
17. Object of a preposition in the objective case                    .68
18. Agreement of pronoun and antecedent                              .79
19. Avoiding indefinite reference for “it” and “they”                .96

Use of adjectives and adverbs:
20. Avoiding the use of “them” for “these” or “those” to modify
    a noun                                                           .31
21. Avoiding the use of adjectives for adverbs and vice versa        .79

Grammatical sentence structure:
22. Avoiding the use of double negatives                             .71
23. Placing a modifier as close as possible to the word it modifies  .88
24. Following correlative conjunctions by sentence elements of
    equal rank                                                       .94
25. Having a sentence containing a participle contain also a noun
    or a pronoun which the participle definitely modifies            .96

* Error-quotient data were taken from the report of the Ohio Survey.

Regardless of the objectives stressed in a particular grade, or the
procedures utilized in directing learning, such objectives should be
made as definite as possible. This is true whether they constitute a
pre-determined list in a course of study or unit, or are determined
from common errors which pupils make in speaking and writing.

Many common errors respond to treatment when they are 
called to the attention of teachers and special remedial activi¬ 
ties are utilized. The present persistence of so many common 
errors suggests that teachers are not aware of these errors, 
hence neither the regular language activities nor remedial situ¬ 
ations are properly planned. 

The first step in diagnosis would be the determination of a 
basic list of objectives to be emphasized from grade to grade. 
Errors are easily identified and require no special test tech¬ 
niques except a check sheet. Each teacher’s check sheet 
should contain the objectives for her grade and all preceding 
grades. Since poor or erroneous usage becomes set by con¬ 
tinued use, all written work, regardless of subject or activity, 
should be checked with these lists. One or two contacts with 
the correct form in a class exercise will not eradicate a habit set 
by fifty experiences in using the wrong form. Although such 
diagnosis would at first appear to be an arduous task, actually 
the present plan trains children in wrong usage. Such a plan 
of diagnosis will not draw attention unduly from thought to 
form, since pupils will write and speak as freely and interest¬ 
ingly as before; but it will serve to keep constantly before 
teachers the particular reasons for directing language activi¬ 
ties. Common errors in written work persist primarily be¬ 
cause teachers do not systematize their diagnosis, refuse to 
accept work with errors, and plan remedial exercises. With 
present knowledge of what errors persist and additional 
knowledge from pre-tests, language activities can be directed 
toward common errors from the beginning, not just during 
remedial activities. 

Virtually the same technique may be utilized in oral language,
except that check sheets with names of pupils and usage-objectives
must be substituted for the written work of pupils. Thus diagnosis
becomes careful, systematic, recorded observation. In both written
and oral types, diagnosis is not complete until pupils have been
given their personal lists of difficulties, and teachers have
determined which are general difficulties, hence class remedial
problems. Consciousness of difficulties and of objectives to be
achieved is basic to improvement.

While complete mastery of good usage by every pupil would 
be ideal, the experience of expert teachers of language reveals 
what we might predict from our general knowledge of indi¬ 
vidual differences, namely, that at present we cannot achieve 
such a goal. It is advisable, therefore, at least once each se¬ 
mester or term to measure actual achievement by some instru¬ 
ment which will reveal how well the class as a whole is doing in 
relation to expected achievement. 

If in addition to supplying this information the test can at 
the same time be diagnostic, the teacher obtains a wide range 
evaluation of important objectives as well. 

The Los Angeles Diagnostic Test in Language * provides both
normative information for determining the general status of the
class and a detailed diagnosis of important learning difficulties.
Figure 36 presents the Diagnostic Record Sheet for this test.

The number of hours spent marking language papers or oral 
language has little effect on improving usage. Only identify¬ 
ing errors, planning activities which emphasize the correct 
forms, and making pupils conscious of them when necessary in 
order to work co-operatively on both oral and written activi¬ 
ties can hold much promise. 

* Armstrong, Caroline. Los Angeles Diagnostic Tests: Language. Los Angeles, Calif.,
California Test Bureau.

FIG. 36. DIAGNOSIS OF LANGUAGE DIFFICULTIES

Objectives in oral as in written language are not confined to good
usage in the sense of grammatical correctness, but involve many
other important subsidiary and concomitant aims. Whatever the
number or type of objectives adopted by a school or teacher, they
should appear on a check sheet or diagnostic record in order to
obtain a continuous diagnosis of difficulties and shortages. Among
the objectives which should appear on such diagnostic record sheets
the following are illustrative:

1. A well-placed and modulated voice 

2. Accurate but not exaggerated pronunciation and enunciation 

3. Ease and poise while speaking 

4. Emotional response appropriate to topic of conversation 

5. Graciousness in facing an audience 

6. Ability to tell a story, discuss a problem, or give a book review 

7. Ability to use an outline in speaking 

8. Possession of a sufficient vocabulary to make conversation 
interesting 

9. Ability to break into and contribute to a discussion 

10. Ability to differ without giving offense 

11. Ability to change the topic of conversation 

12. Ability to avoid mannerisms 

While emphasis in learning activities should be on communi¬ 
cation, interchange of ideas, or thinking so that function in¬ 
stead of knowledge only will be stressed, mere activity as 
such establishes bad language habits as well as good. Pupils 
should gradually become critical of form as well as ideas in 
language, so that intelligent, purposeful practice may occur 
when regular learning activities have failed to establish correct 
habits. 

Diagnosis in writing. The problem in the elementary school, 
like that of the primary, consists of two parts: a determination 
of general merit to see if pupils are writing well enough; and a 
diagnosis of learning difficulties for those who are not. Since 
the number of letters is limited and many words recur with 
great frequency, the problem of diagnosis as well as remedial 
work is fairly simple.

FIG. 37. SAMPLES FROM THE AYERS SCALE

Many scales have been devised for general merit, among which the
best known and most often imitated is the Gettysburg Edition of the
Ayers Scale,* samples of which are presented as Figure 37.

Rate as well as general merit is important, and Ayers’ standards
for rate are presented in Figure 38.

Materials for determination of quality may be dictated at the rate
suggested for the different grades. General merit scales form a
part of the West American Handwriting, the Minneapolis
Self-Corrective, the Detroit, and the Zaner-Bloser scales.
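
A comparison of a pupil's rate with the grade standard can be sketched as follows. The letters-per-minute figures are those quoted in the text accompanying Figure 38; the names `AYERS_RATE` and `rate_gap` and the pupil's rate are hypothetical, chosen only for illustration.

```python
# Letters per minute by grade, as quoted in the text accompanying Figure 38.
AYERS_RATE = {2: 31, 3: 44, 4: 55, 5: 64, 6: 71, 7: 76, 8: 79}


def rate_gap(grade, letters_per_minute):
    """Return the pupil's rate minus the grade standard (negative means below)."""
    return letters_per_minute - AYERS_RATE[grade]


# Hypothetical fourth-grade pupil writing 48 letters per minute.
print(rate_gap(4, 48))
```

A negative result shows only that the pupil writes more slowly than the standard for his grade; quality must still be judged separately against a general merit scale.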

Perhaps the best test of the mastery of good writing habits is the
actual written work done in connection with other activities. For
this reason, the written work of those adjudged unsatisfactory in
general merit should be diagnosed frequently and the list of
difficulties together with appropriate practice exercises be given
to each pupil. Among the typical difficulties in writing will be
found those presented in Table 20.

* Ayers, Leonard P. Handwriting Scale, Grades 2 to 8. New York, Russell Sage
Foundation.

[Graph: rate in letters per minute (vertical axis) plotted against
quality (horizontal axis).]

Average number of letters written per minute and average quality of
writing commonly found in grades from second to eighth. Number of
letters per minute in the second grade is 31, in the third 44, in
the fourth 55, in the fifth 64, in the sixth 71, in the seventh 76,
and in the eighth 79.

FIG. 38. AYERS’ STANDARDS FOR RATE AND QUALITY

In using the Gray Score Card, the numerical scoring may be 
ignored if desired, but the difficulties of each pupil writing be¬ 
low a satisfactory standard should be checked as the basis for 
his personal practice list. 
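
The use of the score card may be sketched in Python as follows. The perfect scores are those printed in Table 20; everything else (the names `PERFECT` and `score_sample`, the sample scores, and the rule that any category scored below full credit goes on the practice list) is an assumption made for illustration, not a procedure prescribed by the score card.

```python
# Perfect scores for the nine categories of Table 20 (they total 100).
PERFECT = {
    "heaviness": 3,
    "slant": 5,
    "size": 7,
    "alignment": 8,
    "spacing of lines": 9,
    "spacing of words": 11,
    "spacing of letters": 18,
    "neatness": 13,
    "formation of letters": 26,
}


def score_sample(scores):
    """Return (total score, categories scored below full credit)."""
    for name, value in scores.items():
        if not 0 <= value <= PERFECT[name]:
            raise ValueError(f"score out of range for {name}")
    total = sum(scores.values())
    # Assumed rule: a category left unscored counts as zero.
    weak = [name for name in PERFECT if scores.get(name, 0) < PERFECT[name]]
    return total, weak


# Hypothetical sample: full credit except for slant and neatness.
sample = dict(PERFECT, slant=3, neatness=10)
print(score_sample(sample))
```

If the numerical scoring is ignored, as the text permits, only the second value returned (the list of categories needing practice) would be used.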

Diagnostic scales of the type presented in Chapter Six are most
helpful, since pupils can then co-operate in diagnosing their own
difficulties and require only the verification of the teacher and
her remedial suggestions in addition to those supplied by the scales
themselves. Where sufficient copies of general merit and diagnostic
scales are not available for each pupil, they should be placed at
convenient points on tables or walls of the classroom so that pupils
may use them freely. Diagnosis of writing difficulty is not complete
until each pupil has his personal practice list and knows how to
practice intelligently in overcoming his difficulties. Incidentally,
poor writing position and poor pen position account for many writing
difficulties; but complete muscular movement is neither necessary
nor natural. The wrist and the fingers as well as the arm play a
part in writing.

Table 20. Gray’s Score Card for Judging Handwriting

Pupil ............ Age ............ Date ............
Grade ............ School ............ Teacher ............

                              Perfect   Score for Each Sample (or Pupil)
                               Score    1  2  3  4  5  6  7  8  9 10 11 12 13 14 15

1. Heaviness                      3
2. Slant                          5
     Uniformity
     Mixed
3. Size                           7
     Uniformity
     Too large
     Too small
4. Alignment                      8
5. Spacing of lines               9
     Uniformity
     Too close
     Too far apart
6. Spacing of words              11
     Uniformity
     Too close
     Too far apart
7. Spacing of letters            18
     Uniformity
     Too close
     Too far apart
8. Neatness                      13
     Blotches
     Carelessness
9. Formation of letters         (26)
     General form                 8
     Smoothness                   6
     Letters not closed           5
     Parts omitted                5
     Parts added                  2

   Total score                  100

Diagnosis in spelling. Differences in school environments result in
differences in speaking as well as writing vocabularies; and
spelling lists, where used, differ from one school system to
another. But such words, once decided upon, are definite and
specific objectives, and misspelling can be easily identified.

As in primary spelling, the pre-test is the basis of diagnosis. 
Pupils should not begin the study of a list as such, but should
study only the words they cannot spell. After the pre-test, 
each pupil should make his own list of study words, and in ad¬ 
dition, the teacher should compile a list of words missed by all 
or most members of the class. 

Among the major causes of misspellings are the following: 

1. Poor vision 

2. Defective hearing 

3. Poor memory, visual or auditory 

4. Poor or slow writing 

5. Carelessness in copying words 

6. Practice exercises not verified 

7. Inability to pronounce words 

8. Spelling and pronunciation not associated 

9. Lack of knowledge of meaning of words 

10. Insufficient study 

11. Knowledge of how to study lacking 

After misspelled words have been identified and studied and
additional tests have been given, the teacher should attempt to 
determine the exact cause or causes of difficulty as the basis 
for remedial work. 

Misspellings in written work as well as words which have 
given difficulty in previous spelling tests should appear sys¬ 
tematically in review tests. Each pupil’s personal study list 
should contain all current and past words which he has failed 
to master. 

At least once each semester, preferably at the beginning, the 
teacher should give a standardized test in spelling in order to 
aid in determining whether or not the class as a whole is up to 
a reasonable standard. While standardized tests differ some¬ 
what in the words used, pupils need and learn to spell many 
words not used on spelling lists. Spelling tests should be selected
primarily on the basis of their correspondence with the word-lists
used in a given community, but the presence of other useful words in
such tests will not seriously reduce test validity.

Language, writing, and spelling are frequently designated as the
language arts because of their close relationship. The Progressive
Diagnostic Language Tests * group them in this manner, and provide a
total score for comparative purposes as well as a diagnosis of
difficulties. Figure 39 presents the diagnostic analysis of
difficulties.

The numerals and capital letters in the analysis correspond to the
sections of the test similarly marked. An inspection of the
unsatisfactory responses will reveal whether or not remedial
instruction is needed. These topics are then checked by the teacher
as the basis for a constructive program. Once an adequate diagnosis
has been made, remedial instruction is frequently a simple matter.

The teacher completes the analysis only for pupils whose test 
results are significantly below expectation, and crosses off the 
notations as pupils master their difficulties. 

Among the separate spelling scales which may be used are the
following: the Ayers Spelling Scale, the Buckingham Extension of the
Ayers Scale, the Iowa Spelling Scale, the Clark Spelling Test, the
Morrison-McCall Spelling Scale, the Tidyman Standard Spelling Scale,
and the Van Wagenen Spelling Scales.

Diagnosis in the social studies. The major objective of the 
social studies is the development of sane, rich, many-sided, ef¬ 
fective personalities. Adjustment diagnosis which starts with 
specific evidences of failure to achieve such personalities is 
closely related and should be reviewed in this connection. 
While there is general agreement in this and other general 
aims, there are wide differences of opinion and belief in regard 
to the specific objectives through which this general goal may 
be attained. 

There is somewhat general agreement that a rich background of facts
and knowledges and their implications is basic; these are obtained
through educational experiences and activities in the fields usually
designated as geography, history, and civics. While from one
standpoint such objectives are primarily intellectual, they must
nevertheless become the foundation on which sound reasoning and
desirable attitudes, ideals, and adjustment are based. Ability to
adjust to enlarging groups begins with living together in the
classroom and playground. The classroom must furnish, as far as
possible, the type of society in which pupils will participate or
hope to participate during and after school days.

* Tiegs, E. W., and Clark, W. W. Progressive Language Tests. Los Angeles, Calif.,
California Test Bureau.

It is not, therefore, the primary business of the social studies 
to develop reading, language, or arithmetical skills, even 
though they are improved through use. These skills are the 
means through which social-studies objectives are attained. 
They are no more social studies than science or art skills; they 
are general in nature and aid in the attainment of objectives 
in all fields. 

Regardless of wide differences in objectives and procedures 
for attaining them, it is nevertheless important that whatever 
objectives are accepted be as definite as possible to serve as 
the basis not only for diagnosis but for the selection of ma¬ 
terials and guidance of learning as well. These objectives will 
consist of facts and information, appreciations and under¬ 
standings, reasoning, ideals, attitudes, and conduct or behavior 
patterns. Because of the variability in objectives and grade 
placement of materials, standardized tests tend to be of very 
limited value. For this reason teachers must depend heavily 
on informal objective tests, described in Chapters Three and 
Four. 

Much of the social-studies program is organized in more or 
less well-defined units. Pre-tests, diagnostic tests, and com¬ 
prehensive unit-tests may be utilized to advantage if teachers 
really know what they are attempting to do and express this 
knowledge in tests of important objectives of different types. 

In general, learning difficulties in social studies may be iden¬ 
tified in the following manner: 

1. Facts and information and their implications

a. Informal objective tests, the items of which represent important
teaching objectives

b. Standardized tests, such as the Brown-Woody Civics Test, the
Burton Civics Test, Branom’s Diagnostic Tests in Geography, Witham
Geography Tests, the Ely-King Interpretation Tests in American
History, and the Van Wagenen American History Scales. Unless the
items of the tests correspond fairly closely with the objectives of
the units studied, the norms are of no particular value. But
carefully standardized tests frequently provide a wide sampling of
important objectives, hence learning difficulties may be detected by
studying the individual items.

2. Skills in tools of learning and reasoning

a. Informal objective tests

b. Standardized tests in various skills. Often social-studies
outcomes are not attained because students cannot read, or cannot
handle quantitative data.

3. Appreciations, attitudes, and ideals

a. Informal objective tests

b. Observation and recording of evidences

c. Standardized tests such as the Tomlin Best Things To Do test,
Lehman’s Play Quiz, the Association Press Personal Attitudes Test,
the Baker Telling What I Do test, the Character Education Inquiry
Tests of Honesty, and the like

4. Conduct and habits

a. Systematic observation and recording of actual conduct and habits

b. Standardized scales like the Haggerty-Olson-Wickman Behavior
Rating Schedules

Diagnosis in music. Where work in music is well organized, 
each pupil knows what is expected of him and does his part as 
far as he is able. Since a major part of the activity is singing, 
much of the diagnosis of difficulty here, as on the primary 
level, must be done while the class is in action. However, 
there are two well-known tests of musical aptitude, one of 
which should be given to all pupils at least every two years, 
and at such other times as serious questions arise: 

1. The Seashore Measures of Musical Talent 

2. The Kwalwasser-Dykema Music Tests

Both of these tests depend upon hearing to evaluate such factors as
tonal memory, intensity, pitch, and the like; when results are
unsatisfactory, they should be supplemented by regular tests of
hearing.

Knowledge about music, including such factors as syllables, 
time, note values, breathing, and pronunciation, may be tested 
in regular objective tests, the whole class being tested in the 
time it would take to make one individual test. These require 
both visual and auditory acuity, and in case of persistent dif¬ 
ficulty should be supplemented with tests of vision and hearing. 

Many objectives can be attained only through active partic¬ 
ipation, and it is here that the teacher may use the diagnostic 
record sheet to advantage. Among the items which should 
appear on such a sheet are the following: 

1. Pleasure in singing and playing 

2. Ability to use smooth pleasing tones in singing 

3. Ability to remember and sing previously learned songs 

4. Ability to remember and play selections previously learned 

5. Ability to read syllables at sight 

6. Ability to read notes at sight 

7. Ability to maintain correct time 

8. Ability to sing and play each note at the correct pitch 

While the class is at work, the teacher passes among pupils 
(difficulties are usually located near the front seats when seat¬ 
ing is correct) detecting the evidences of difficulty. If a pupil 
has difficulty singing in time she may tap on his shoulder until 
he corrects it; if he is off pitch, she sings the correct note; when 
necessary she gives verbal directions. All of these difficulties 
may be diagnosed while singing or playing proceeds, and it is 
rarely necessary to stop the work. 

Individual learning difficulties are revealed in individual work,
some of which should form a part of every well-organized music
program. In this case, the class may aid in diagnosis by repeating a
phrase correctly as soon as an error is made. Incidentally, this
acts as an automatic class diagnosis as well.

Sometimes it is desirable to make more detailed individual
diagnoses, in which case a device * like that illustrated in
Figure 40 may be used.

1. Copy one to three songs which the pupil should be able to sing well. 

2. Draw lines under the staff to represent various factors which may 
be checked as the individual singing proceeds. 

3. Have the pupil, reading from his music book, sing the song. 

4. As the singing proceeds, the teacher checks the errors on appro¬ 
priate lines below the points at which they occur. 

5. The pupil repeats the song until the diagnosis is complete. 


[Music staff, with check lines beneath it labeled:]

1. Syllables
2. Time values
3. Notes
4. Tone
5. Breathing
6. Pronunciation
7. Song

FIG. 40. ILLUSTRATION OF INDIVIDUAL DIAGNOSTIC MUSIC TEST

* Tiegs, E. W. The Management of Learning in the Elementary Schools, p. 224. New
York, Longmans, Green and Company, 1937.

From time to time it is desirable to determine the status of certain
outcomes which can be measured by regular standardized tests to see
how well each member of the class and the class as a whole is
progressing in relation to expectations.

Among these outcomes are knowledge of music symbols, key signatures,
note values, rest values, compositions, composers, attitudes, and
ideals. Figure 41, for example, is an excerpt from Test 7 of the
Kwalwasser-Ruch * Test of Musical Accomplishment. An analysis of
such tests, even though not primarily set up for diagnosis, may
nevertheless yield diagnostic information and serve as the point of
departure for remedial activity.

Test 7. Knowledge of Key Signatures

Directions: At the left below is a column of ten major key
signatures. At the right is a column of five minor key signatures.
You are to write the names of the keys on the lines at the right of
each signature.

Notice that there are two columns, one for major keys and one for
minor.

Samples: [key-signature notation not reproduced]

Begin here.

Major Key Signatures          Minor Key Signatures

FIG. 41. SAMPLE OF TEST OF KEY SIGNATURES

Among other tests which are useful in this connection are the 
Guildersleeve Musical Achievement Test, Hutchinson Music 
Tests, Hilbrand Sight Singing Tests, the Torgerson-Fahne¬ 
stock Music Test, and the Providence Inventory Test in 
Music. 

* Kwalwasser, Jacob, and Ruch, G. M. Test of Musical Accomplishment. Iowa City,
Ia., 1924. Published by the Extension Division, University of Iowa.

Diagnosis in art. The fundamental problem in diagnosing learning
difficulties in this field has been the vagueness and variety of
objectives. Teachers continue to work with construction, modeling,
crayon, brush, and pencil, but are not aware of specific outcomes to
be attained by any one of them. Some programs appear to be almost
entirely drawing; others emphasize picture study or some other one
aspect. Diagnosis of difficulties must proceed in terms of the
particular activities which represent art to each teacher, but there
is a constant tendency to avoid being definite and specific about
objectives and to depend upon general impressions.

Art requires good vision, memory, spacial and visual ability, 
and eye-hand co-ordination. In case of persistent difficulties in 
drawing, construction, or the like, it is well to repeat tests of 
these abilities which have been previously presented. Some 
investigators report a high correlation between I.Q. and art 
ability, but owing to the uncertainty about what constitutes 
art, and the low correlations elsewhere, the assertion is doubt¬ 
ful. 

Art activities should not be all representative. Apprecia¬ 
tion, knowledges, and information about art and art products, 
facts about media and how to handle them, art principles in 
relation to color, line, and design should be specific and definite. 
Success or failure should be carefully checked and recorded and 
evaluated through informal tests and standardized tests to the 
extent available. 

The evaluation of skill, for example, cannot be done in a 
general vague way, but should include the following among 
other factors: 

1. Skill in using crayon, pencil, or pen 

2. Skill in using the brush 

3. Skill in using charcoal 

4. Skill in handling and modeling clay 

5. Skill in different types of carving 

6. Skill in design 

7. Skill in discrimination 
Each of these factors needs to be broken down into still
more specific elements if diagnosis is to mean anything to students
and be helpful in guiding remedial work. In the same
manner appreciation and knowledge do not mean appreciation
or knowledge generally, but appreciation or knowledge of many
specific important concepts and facts related to art. For example,
a test on color may contain such items as the following:

1. Red is a (1) warm, (2) cool color.

2. A large amount of red with a small amount of blue makes a (1) warm, (2) cool color.

3. Red and blue properly mixed make which of the following colors? (1) green, (2) yellow, (3) violet, (4) magenta, (5) brown.

4. To gray blue use (1) orange, (2) yellow.

5. To gray red use (1) black, (2) brown, (3) turquoise blue.

A number of workers have attempted to identify some of the 
important specific objectives and have constructed tests. As 
in other fields where uncertainty of objectives exists, such tests 
may be of more value in a particular community for diagnosis 
than for the normative purpose for which they were developed. 
Dr. Lewerenz,* for example, has broken down the fundamental 
abilities of visual art into nine more specific abilities, including 
the following: 

1. Recognition of proportion 

2. Originality of line drawing 

3. Observation of light and shade 

4. Knowledge of subject-matter 

5. Visual memory of proportion 

6. Analysis of cylindrical perspective 

7. Analysis of parallel perspective 

8. Analysis of angular perspective 

9. Recognition of color 

Observation of light and shade is presented as Figure 42. 

* Lewerenz, A. S. Test in the Fundamental Abilities of Visual Art. Los Angeles, Calif., 
California Test Bureau. 
A number of drawing scales, devised to determine how well
children can draw, are more useful as models of what to avoid 
and what to imitate. 

McAdory ¹ has designed a test dealing with judgments of the
artistic merit of articles of furniture, utensils, textiles, clothing,
and so forth, which is highly reliable. Meier and Seashore ²
have standardized 125 pairs of pictures; the subject judges
which of each pair is more pleasing, satisfying, or artistic. Such a
test will suggest many of the specific objectives toward which
learning in art should be consciously directed and in terms of
which difficulties may be diagnosed.

Knauber ³ has developed art-ability tests for junior and
senior high schools. The following items were taken from the 
test of art vocabulary: 

1. What is a tint? 

a. A shaded effect 

b. A kind of ink 

c. A light value of a color 

d. Abbreviation for tintype 

e. A shade of green 

2. What is a contour? 

a. Smooth surface 

b. Subject-matter

c. Reverse pattern 

d. Outline of a mass 

e. Plan of a drawing 

3. What is technique? 

a. Manner of execution 

b. Fine finish 

c. Method of advance 

d. Manner of construction 

e. Color combinations 

¹ McAdory, Margaret. Art Test. Published by Bureau of Publications, Teachers College,
Columbia University, New York.

² Meier, N. C., and Seashore, C. E. The Meier-Seashore Art Judgment Test. Iowa City,
Ia., Bureau of Educational Research and Service, University of Iowa, 1929.

³ Knauber, A. J. Knauber Art Vocabulary Test. Cincinnati, O., A. J. Knauber,
1932.
Diagnosis in industrial and household arts. These fields are
closely related to both art and the social studies, some of the
objectives appearing in all. However, much of the work in
industrial arts is now organized in units and the same
techniques may be used in diagnosis of learning difficulties
with facts, information, understandings, and attitudes as were
presented in connection with social-studies units. Household
arts in the elementary school are still largely confined to definite
specific tasks of doing or making, and diagnosis of learning
difficulty proceeds both while work is in progress and by the
use of various product scales. For example, the Murdoch
Sewing Scale consists of photographs showing samples of hemming,
basting, overcasting, seams, and the like. Both pupils
and teacher compare pupil products with the scale for diagnostic
and appraisal purposes.

Among the objectives in industrial arts are the following: 

1. The sources of raw materials 

2. Types and qualities of raw materials

3. How things are made 

4. How things are cared for and repaired 

5. Sizes and grades of manufactured products 

6. Distribution of products 

7. Factory methods, opportunities, and working conditions 

Diagnosis with informal objective tests is most satisfactory 
since courses are not yet well standardized. Aims also include 
many specific objectives in buying and caring for articles, 
ability to plan and draw plans of articles to be made, ability 
to make and repair articles, skill in the use and care of tools, 
skill in making repairs, appreciations of the problems of em¬ 
ployers and employees, and desirable attitudes and ideals based 
on a knowledge of facts. 

Important specific objectives should be determined for each 
of the major aims, and diagnosis of learning difficulties be made 
through careful systematic observation of activities and 
through regular informal objective tests. 
Tests need not be of the pencil-and-paper variety to be objective
and reliable. The square is one of the most usable and
accurate tests in industrial arts. Color charts and samples of 
finishes of various degrees of merit constitute product scales 
for determining the extent to which various activities are sat¬ 
isfactory, and are used as the point of departure in analyzing 
the causes of unsatisfactory work. 

In spite of the variety of programs and differences in the ob¬ 
jectives consciously pursued, workers have attempted to iden¬ 
tify important objectives which should be standard in the in¬ 
dustrial- and home-arts programs. While the final proof in 
cooking has always been the eating, those who cook best do not 
do so accidentally, but partially because they possess certain 
essential facts and understandings. Test items * like those 
which follow, for example, reveal the probable success of stu¬ 
dents in cooking, or may reveal the causes of bad cooking. 

17. To make muffins light and of good texture: 

a. sift the dry ingredients. Beat eggs, add them to milk. Add 
dry ingredients gradually to liquid, beating thoroughly. Bake 
in hot oven. 

b. sift dry ingredients. Beat eggs, add them to milk. Add dry 
ingredients to liquid, stirring only enough to mix, bake in 
moderate oven. 

c. sift dry ingredients. Beat eggs, add to milk, pour liquid into
dry ingredients, stirring only enough to mix, bake in hot oven. 

18. In order to have a steak, chop, or roast brown and juicy: 

a. start with intense heat, decrease after it is well seared. 

b. start with very little heat and increase gradually till brown.

c. cook the entire time with moderate heat. 

Difficulties in handling tools, drawing, and doing activities in 
general may be due to lack of motor co-ordination or mechani¬ 
cal ability, and may be determined by the use of such a test as 
the MacQuarrie Test of Mechanical Ability, a sample of which 
is presented in Figure 43. Special tests in drawing, metal work, 

* Streeter, Nina, and Trilling, Mabel. Foods Preparation Test. Bloomington, Ill., Public
School Publishing Company, 192?.
printing, and the like have been devised on the upper elementary
and secondary level, but for the most part their purpose
is appraisal at the end of the course rather than diagnosis. 
Instructors should list the important objectives of each kind 
which they expect pupils to attain, and detect shortages 
through check list and objective test techniques. 



FIG. 43. SAMPLE FROM MACQUARRIE TEST OF 
MECHANICAL ABILITY 

From MacQuarrie, T. W., Test of Mechanical Ability. Los Angeles, Calif., California
Test Bureau.
The fruits of diagnosis are frequently lost when it is stopped
at the mere location of a difficulty. Pupils should understand 
clearly the nature and causes of the difficulty, take notes 
on remedial work, and in many cases receive a follow-up diag¬ 
nosis. 

Diagnosis in elementary science. Science experiences and
activities are coming to occupy an increasingly important place 
during the period of elementary education. Whereas recent 
developments in the social studies have added certain impor¬ 
tant outcomes in terms of socialization and adjustment, science 
assists in eradicating superstitions, gullibility, and credulity, 
and supplanting them with a host of definite principles, gen¬ 
eralizations, cause-and-effect relationships, and thinking habits 
which are just as essential to the development of many-sided, 
well-balanced, effective personalities. 

Whereas certain science concepts and objectives appear in 
the early social-studies units, the importance of science and 
scientific thinking in modern life has justified, in the minds of 
many, a continuous, systematic program of science education 
from the time the child enters school, rather than a dependence 
upon social-studies units organized for the most part to attain 
other outcomes. Courses in elementary science exhibit a 
marked trend toward units built around important principles 
and generalizations.* Techniques of diagnosis should be
similar to those used in connection with social-studies or in¬ 
dustrial-arts units, where important objectives related to each 
principle or generalization are first carefully determined, and 
pre-tests and diagnostic tests freely employed to guide learning 
activities. 

Among the types of objectives which should be attained and 
should appear in diagnostic tests are the following: 

* See Tiegs, E. W. The Management of Learning in the Elementary Schools. Chapter XIII.
New York, Longmans, Green and Company, 1937. Also “A Program for Teaching Science”;
in Thirty-First Yearbook, National Society for the Study of Education. Part I, Chapter X.
Bloomington, Ill., Public School Publishing Company, 1932.
1. A wide variety of important and interesting facts observable in
everyday life 

2. Principles and generalizations related to these facts 

3. Appreciations and understandings of nature and science 

4. Scientific attitudes and ideals 

5. Habits and techniques of scientific thinking 

6. Science interests 

7. Scientific habits of work 

In breaking down scientific attitudes, for example, into more 
specific objectives for learning and as the basis of diagnosing 
learning difficulties, the following classification adapted from 
Curtis ¹ and Craig ² is suggested:

1. Development of cause-and-effect convictions resulting in eradica¬ 
tion of attitudes characterized by 

a. Superstitions 

b. Credulity 

c. Unexplainable mysteries, etc.

2. Suspended judgment and delayed conclusions which provide op¬ 
portunity for adequate thinking 

3. Careful, painstaking searching for reasons for observable phe¬ 
nomena 

4. Formulating hypotheses for explanation of observed phenomena 

5. Weighing observable evidence, and consciousness that other evi¬ 
dence may be available 

6. Open-mindedness, and willingness to face facts 

7. Consciousness that man’s knowledge of truth is relative; what he 
regards as truth today may be error tomorrow, and persist as 
superstition five decades hence 

8. Willingness to depend on causal explanation even when evidence 
is not available 

Each of these attitudes will be broken down into additional
specifics when applied to a given problem. The basic tenet of
diagnosis, here as elsewhere, is the determination of the important
specific outcomes (here in terms of attitudes) which
pupils are expected to attain, and the diagnosis of learning difficulties
in terms of these specifics.

¹ Curtis, F. D. Some Values Derived from Extensive Reading in General Science. Teachers
College Contributions to Education, no. 163, 1924.

² Craig, G. S. Certain Techniques Used in Developing a Course of Study in Science for the
Horace Mann Elementary School. Teachers College Contributions to Education, no. 276,
1927.

Among the standardized tests available for diagnostic purposes,
the Calvert General Science Information Tests * are
useful in testing a wide variety of interesting and important 
facts. Figure 44 presents the diagnostic profile for the ele¬ 
mentary test, grades 4, 5, and 6. By combining pupil scores 
and comparing relative attainment in the four major fields, 
the teacher knows what to stress for the class as a whole and 
for any pupil. The Analytical Scales in Elementary Science, 
Iowa Every-Pupil Test in General Science, and the Modern 
School Achievement Elementary Science Tests are examples of 
carefully developed lists of important objectives. While, in 
general, elementary science tests have been developed for ap¬ 
praisal purposes, a study of pupil failures in the original test 
and detailed work in analyzing them will afford diagnostic 
guidance for teaching. 
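
The class-profile idea in the paragraph above can be illustrated with a short sketch. It is an illustration only: the four field names, the pupil names, and the scores are hypothetical (the four major fields of the Calvert profile are not listed here), and the helper functions are not part of any published test.

```python
from statistics import mean

# Hypothetical field names and scores (percent correct);
# none of these values come from the Calvert tests.
scores = {
    "Pupil A": {"field_1": 80, "field_2": 55, "field_3": 70, "field_4": 90},
    "Pupil B": {"field_1": 60, "field_2": 40, "field_3": 85, "field_4": 75},
    "Pupil C": {"field_1": 70, "field_2": 50, "field_3": 65, "field_4": 45},
}


def class_profile(scores):
    """Average each field over all pupils to get a class profile."""
    fields = next(iter(scores.values())).keys()
    return {f: mean(p[f] for p in scores.values()) for f in fields}


def weakest_field(profile):
    """Return the field with the lowest score in a profile."""
    return min(profile, key=profile.get)


profile = class_profile(scores)
print("Class profile:", profile)
print("Stress for the class as a whole:", weakest_field(profile))
for pupil, fields in scores.items():
    print(pupil, "needs most help in", weakest_field(fields))
```

The same comparison serves both purposes named in the text: the class averages show what to stress for the group, and each pupil's own row shows where that pupil stands lowest.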


QUESTIONS AND EXERCISES 

1. If possible, select a pupil failing in arithmetic, make a complete 
diagnosis (including written and oral if necessary), and outline a 
remedial program which you believe would be effective. 

2. Outline important procedures used in diagnosing learning diffi¬ 
culties in language, writing, and spelling. Discuss them with 
teachers in the schools and report any modifications you think 
desirable. 

3. Discuss the problem of diagnosis in the social studies and show its 
relation to techniques described earlier. 

4. Describe the diagnosis of learning difficulties in music. What objectives
are elusive and usually treated by the general impression
method? Can you suggest more objective and reliable techniques?

* Calvert, Everett. General Science Information Tests. Los Angeles, Calif., California
Test Bureau, 1937.




Grade Placement 

FIG. 44. DIAGNOSTIC PROFILE FROM CALVERT GENERAL SCIENCE INFORMATION TESTS 
. complete a classification as you can of the difficult types of
es in art. Which of them can be identified with diagnostic
les suggested? Can you suggest methods for dealing with 

i complete set of diagnostic procedures for detecting learning 
es in the various objectives in industrial arts. Discuss the 
roduct scales in household arts. 

the contribution of the Calvert General Science Information 
the diagnosis of learning difficulties in science. 

L diagnostic test for scientific attitudes. 

ORKING BIBLIOGRAPHY OF TESTS 

hundreds of tests are now available in the various fields on 
tary level. The following is only a restricted sampling, 
lustive list, turn to bibliographies such as those of Buros and 
Publishers as well as authors are glad to furnish information 
he nature and recency of test revisions as well as new tests
efit of those interested in selecting tests. 

Badge Test for Boys and Girls, Elementary Schools.

PLAYGROUND AND RECREATION ASSOCIATION OF AMERICA.

rs: Department of Interior, Office of Education, Washington.

Writing and (2) Spelling Scales, Grades 2 to 8.

AYERS, L. P.

rs: Russell Sage Foundation, New York.

ooDY Civics Test.

BROWN, A. W., AND WOODY, C.

rs: World Book Company, Yonkers-on-Hudson, N.Y.

AM Extension of the Ayers Spelling Scale, Grades 2

BUCKINGHAM, B. R.

rs: Public School Publishing Company, Bloomington, Ill.

I vies Test, Grades 5 to 9.

BURTON, W. H.

rs: World Book Company, Yonkers-on-Hudson, N.Y.

A Test of Personality, Grades 4 to 9.

THORPE, L. P., CLARK, W. W., AND TIEGS, E. W.

rs: California Test Bureau, Los Angeles.
California Test of Mental Maturity, Grades 4 to 8.
Authors: SULLIVAN, E. T., CLARK, W. W., AND TIEGS, E. W.
Publishers: California Test Bureau, Los Angeles.

Charters Diagnostic Language Test, Grades 3 to 8.
Author: CHARTERS, W. W.
Publishers: Public School Publishing Company, Bloomington, Ill.

Clapp-Young Arithmetic Test, Grades 5 to 8.
Authors: CLAPP, F. L., AND YOUNG, R. V.
Publishers: Houghton Mifflin Company, Boston.

Clapp-Young Self-Marking English Test, Grades 5 to 12.
Authors: CLAPP, F. L., AND YOUNG, R. V.
Publishers: Houghton Mifflin Company, Boston.

Compass (1) Survey and (2) Diagnostic Tests in Arithmetic, Grades 2 to 8.
Authors: GREENE, H. A., RUCH, G. M., AND OTHERS.
Publishers: Scott, Foresman and Company, Chicago.

Guildersleeve Musical Achievement Test, Grades 4 to 8.
Author: GUILDERSLEEVE, G.
Publishers: Bureau of Publications, Teachers College, Columbia University, New York.

Henmon-Nelson Test of Mental Ability, Grades 3 to 8.
Authors: HENMON, V. A. C., AND NELSON, M. J.
Publishers: Houghton Mifflin Company, Boston.

Iowa Every-Pupil Tests, Silent Reading, Language, Arithmetic, etc., Grades 6, 7, and 8.
Authors: HORN, ERNEST, AND McBROOM, MAUDE.
Publishers: Bureau of Educational Research, University of Iowa, Iowa City, Ia.

Kuhlmann-Anderson Test of General Intelligence, Pre-School to Adult.
Authors: KUHLMANN, F., AND ANDERSON, R.
Publishers: Educational Test Bureau, Minneapolis, Minn.

Kwalwasser-Ruch Test of Musical Accomplishment, Grades 4 to 12.
Authors: KWALWASSER, J., AND RUCH, G. M.
Publishers: Bureau of Educational Research and Service, University of Iowa, Iowa City, Ia.
Lewerenz Test in Fundamental Abilities in Visual Art, Grades 3 to 12.
Author: LEWERENZ, A. S.
Publishers: California Test Bureau, Los Angeles, Calif.

Modern School Achievement Tests, Grades 2 to 8.
Authors: GATES, A. I., AND OTHERS.
Publishers: Bureau of Publications, Teachers College, Columbia University, New York.

Nelson Silent Reading Test, Grades 3 to 8.
Author: NELSON, M. J.
Publishers: Houghton Mifflin Company, Boston.

New Stanford Achievement Tests, Grades 2 to 9.
Authors: KELLEY, T. L., AND OTHERS.
Publishers: World Book Company, Yonkers-on-Hudson, N.Y.

Progressive Achievement Tests, Grades 4 to 6, and 7 to 9.
Authors: TIEGS, E. W., AND CLARK, W. W.
Publishers: California Test Bureau, Los Angeles.

Public School Achievement Tests, Grades 3 to 8.
Authors: ORLEANS, J. S., AND TORGERSON, T. L.
Publishers: Public School Publishing Company, Bloomington, Ill.

Safety Education Tests, Elementary Grades.
Author: STACK, H.
Publishers: National Safety Council, New York.

Sangren-Woody Reading Test, Grades 4 to 8.
Authors: SANGREN, P. V., AND WOODY, CLIFFORD.
Publishers: World Book Company, Yonkers-on-Hudson, N.Y.

Seashore Measures of Musical Talent, Grades 2 to Adult.
Author: SEASHORE, C. E.
Publishers: Columbia Phonograph Company, New York.

Unit Scales of Attainment, Grades 1 to 9.
Authors: BRANOM, M. E., AND OTHERS.
Publishers: Educational Test Bureau, Minneapolis, Minn.

Van Wagenen English Composition Scales, Grade 2 Through College.
Author: VAN WAGENEN, M. J.
Publishers: World Book Company, Yonkers-on-Hudson, N.Y.
Wilson Language Error Test, Grades 3 to 12.
Author: WILSON, G. M.
Publishers: World Book Company, Yonkers-on-Hudson, N.Y.

Witham Tests in (1) Geography and (2) History, Grades 5 to 9.
Author: WITHAM, E. C.
Publishers: California Test Bureau, Los Angeles, Calif.
CHAPTER NINE


DIAGNOSIS AND APPRAISAL IN 
SECONDARY AND HIGHER 
EDUCATION 


Subject-matter specialization, teachers’ lack of training 
in adequate determination of objectives, and their lack of 
ability to identify desirable outcomes constitute three of the 
most serious obstacles to learning on the secondary and higher 
levels. From one standpoint, the measurement of outcomes 
is greatly simplified when emphasis is upon subjects, but con¬ 
fusion regarding the nature and functions of measurement de¬ 
lays progress. 

Often each of a number of different instructors possesses 
significant information about students which, if in the posses¬ 
sion of all, would make more intelligent diagnosis possible. 
In general, custom and tradition demand that most students
be graduated. With little knowledge of diagnosis on the part 
of the teacher, and traditional methods of appraisal still in 
general use, there is no assurance that the objectives of the 
various high-school curricula have been attained. Most col¬ 
lege instructors use equally inadequate methods of diagnosis 
and appraisal and never discover the actual status of their own 
students, hence the large number of uneducated college 
graduates. 

Teacher-training and attitudes. The physician spends six
or seven years preparing to diagnose and prescribe, and his 
subsequent practice is often restricted to a limited portion of 
the human anatomy. Most high-school and college teachers 
have not had six or seven months of preparation in the diagnosis
of learning difficulties or the appraisal of outcomes; yet it
is essential that they deal with the diagnosis of the much more 
intangible difficulties of the whole student — physical, mental, 
emotional, and intellectual — if learning is to be adequately 
directed. Physicians are trained from the beginning that
diagnosis of causes is the first step in handling a health diffi¬ 
culty, yet teachers almost universally meet learning difficulties 
by merely repeating activities which have already failed. In 
difficult cases the physician makes a whole series of tests, many 
of them expensive, and then brings all information together in 
order to obtain a more complete picture; but the teacher who 
spends hours finding just the right hat frequently criticizes 
tests because a half-hour achievement examination, costing 
but a few cents, does not furnish the final answers on student 
ability, skills, physical condition, emotional stability, and 
other problems related to learning and adjustment. Diagno¬ 
sis is no push-button procedure. Teachers must learn to ob¬ 
tain, assemble, organize, and interpret diagnostic data. 

Secondary-school instructors are now generally required to 
take a certain number of units in Education in preparing to 
teach, but the objectives to be attained are often vague, spe¬ 
cific courses are frequently not even designated, or the grouping 
of courses favors the specialties of the curriculum makers. 
The result has too often been an assortment of fragmentary and 
relatively useless facts about the history of education and some 
interesting speculative ideas about educational philosophy, 
with little evidence of skill in directing educational activities 
or appraising educational outcomes, the purpose for which they 
are being trained primarily. In general, college teachers are 
employed on the basis of subject mastery, and for the most 
part need give no adequate evidence of competence in the art 
or science of teaching. Well-trained modern teachers fre¬ 
quently are at a loss to understand the procedures which pass 
for teaching in college and university classrooms. 
It is not at all strange, therefore, that instead of utilizing
the results of research and the techniques of diagnosis and ap¬ 
praisal which have been developed during the past two decades, 
so many high-school and college teachers are only now discuss¬ 
ing the possibility and desirability of diagnosis and appraisal. 
They are debating whether or not tests can measure the out¬ 
comes of their courses, and are entirely oblivious of the fact 
that techniques for diagnosis and appraisal have far outstripped 
their knowledge of their own objectives and outcomes. The 
problem is not primarily one of devising additional measurement
techniques, but of keeping high-school and college teachers
abreast of the times in learning to utilize techniques already
available. Many recommended high-school students are 
totally unprepared for college work; many college teachers 
“pass” students without any attempt at diagnosis or ap¬ 
praisal, mere registration in and attendance at courses being 
sufficient for credit. 

The present major problem in teacher-training is to demon¬ 
strate the wastefulness and uselessness of the methods to which 
so many of them cling so tenaciously; to have them under¬ 
stand that the use of better diagnostic and appraisal techniques 
now available will enable them to do more easily and more ef¬ 
fectively what they now attempt to do by arduous, crude, 
time-consuming methods, and that the major purpose of better 
diagnosis and appraisal is not to fail or eliminate more students 
but to detect their difficulties and render necessary aid in order 
that promotion, credits, and graduation may be justified on 
the basis of actual achievement. 

Mental maturity. The necessity and desirability of a knowl¬ 
edge of mental maturity of students has been considered in 
several other connections. Little remains to be said except 
that in general the attainment of increasingly complex and 
difficult objectives of succeeding levels of learning requires increasingly
higher levels and patterns of ability. For this reason,
all entering high-school students should be given at least
one test, followed by one or two checking tests for students 
who do not adjust normally. Unless accurate data are avail¬ 
able, students who have serious learning difficulties and are 
social problems, or who in other ways fail to make normal 
progress, should be retested. Colleges and universities now 
generally give intelligence tests or psychological examinations 
to all entering freshmen, but retesting is rarely practiced. 

In administering such tests on both secondary and higher 
levels, the examiner must be sure that the student is suffering 
no visual or auditory defects which will prevent him from doing 
himself justice. He must be sure that other conditions for taking
the examination are favorable; and he must be aware of the
fact that some students earning the highest and lowest scores 
would obtain lower and higher scores if the test were repeated, 
hence the desirability of having at least two test scores for 
students experiencing learning difficulties. It is just as desira¬ 
ble on the secondary and higher levels, as elsewhere, that tests 
used yield not only a single measure of maturity, but data 
which will guide teachers in solving student learning difficul¬ 
ties. In general, all students on these levels should take group 
tests, but rechecks should be given individually. The Stanford-Binet,
for example, is just as usable here as in the primary
or elementary grades, measuring from two years to superior 
adult. The California Test of Mental Maturity is useful on
the high-school level where teachers are learning to use intelligence-test
data, because it yields separate measures of memory,
spacial or visual ability, logical and mathematical reasoning, 
and vocabulary or usable concepts. In addition, its pre-tests 
reveal students with eye, ear, or co-ordination difficulties with¬ 
out requiring separate tests of this type. Its range is to thirty- 
two years of mental age, a level attained only by the most 
superior adults. 

Both of these tests are described at some length in previous 
chapters. 
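
The caution about extreme scores can be made concrete with a small sketch. Under the usual classical assumption that the retest is a parallel form of the first test, the expected retest score lies closer to the group mean than the first score, by a fraction equal to the test's reliability. The function name, the group mean of 100, and the reliability of 0.9 are assumptions chosen for illustration; they are not taken from any test manual.

```python
def expected_retest(first_score, group_mean, reliability):
    """Expected score on a parallel retest.

    Scores regress toward the group mean: the less reliable the test,
    the larger the expected shift for very high or very low scorers.
    """
    return group_mean + reliability * (first_score - group_mean)


GROUP_MEAN = 100    # assumed mean of the group tested
RELIABILITY = 0.9   # assumed test-retest reliability, illustration only

for first in (70, 100, 130):
    retest = expected_retest(first, GROUP_MEAN, RELIABILITY)
    print(f"first score {first}: expected retest about {retest:.0f}")
```

A pupil far below the mean is expected to score somewhat higher on a second test, and a pupil far above it somewhat lower, which is the reason for wanting at least two scores before acting on an extreme result.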





The Thorndike Intelligence Examination, the Otis Self- 
Administering Tests, and the Thurstone Psychological Ex¬ 
amination are among those most often used with high-school 
graduates. 

Adjustment diagnosis. Most students live with their par¬ 
ents during primary and elementary years. Although a con¬ 
siderable number attend secondary schools at some distance, 
a much larger proportion of students travel long distances and 
enter an entirely new environment when they attend college 
and university. New surroundings, the necessity for earning 
a part of their expenses, and the gap between high-school and 
college activities produce many problems and tensions. Fre¬ 
quently they neglect their work for two or three weeks, take in¬ 
sufficient rest, and find themselves in difficulty before their 
college careers are launched. 

Lewerenz-Steinmetz Orientation Test. A part of the dif¬ 
ficulty is with the secondary school. Many students finish 
high school naive and gullible, still holding to superstitions 
which should have disappeared as the result of their high-school 
training, whether or not they planned to go to college. Where¬ 
as tests of mental maturity reveal the varying nature of ca¬ 
pacities to achieve, and achievement tests reveal actual 
achievement of certain objectives, a special type of examina¬ 
tion is needed to detect the attainment of desirable outcomes 
not appraised by intelligence or ordinary achievement tests. 
The Lewerenz-Steinmetz Orientation Test * was designed to 
meet such a need. The following are sample items: 

( ) A copper wire fastened securely about the body will prevent rheumatism.

( ) A boil cleans out the blood of a person and tones up his system.

( ) The face shows how intelligent a person is.

( ) A severe mental shock causes a man’s hair to turn white overnight.

( ) It is a big aid to health to say each morning, “Day by day in every way I am getting better and better.”

( ) Laughing at teachers’ jokes helps one to get good marks in school.

( ) The boy who regularly stands at the foot of his class is often a great success after leaving school.

( ) A snake never dies until after the sun goes down.

( ) Porcupines fight by shooting their sharp spines at their enemies.

( ) In divorce the man is always most to blame.

( ) The special abilities of a man may be determined by the shape of his skull.

( ) All Scotchmen are skillful business men.

( ) The cartoon figure of “John Bull” looks like the typical Englishman.

( ) Handsome men are less dependable than homely men.

( ) Anyone would steal a million dollars if he were sure of never being caught.

* Lewerenz, A. S., and Steinmetz, H. C. Orientation Test. Los Angeles, Calif.,
California Test Bureau. Revised, 1935.

Such a test reveals not only lack of individual growth, but, 
when test results for a class are grouped, yields information 
regarding the success with which the social and natural sciences 
are functioning. 
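
The grouping of class results mentioned above can be sketched in a few lines. This is only an illustration, not the scoring procedure of the Orientation Test itself: the pupils and their answers are invented, the item texts are abbreviated from the samples, and it is assumed that marking a statement as true counts as accepting the misconception.

```python
from collections import Counter

# Abbreviated sample statements (keys are hypothetical labels).
ITEMS = {
    "copper_wire": "A copper wire about the body will prevent rheumatism.",
    "snake": "A snake never dies until after the sun goes down.",
    "porcupine": "Porcupines fight by shooting their spines.",
}

# Invented class data: the statements each pupil marked as true.
class_responses = {
    "pupil_1": {"snake", "porcupine"},
    "pupil_2": {"porcupine"},
    "pupil_3": set(),
    "pupil_4": {"copper_wire", "porcupine"},
}


def endorsement_rates(responses):
    """Return, for each item, the fraction of pupils who marked it true."""
    counts = Counter(item for marked in responses.values() for item in marked)
    n_pupils = len(responses)
    return {item: counts[item] / n_pupils for item in ITEMS}


rates = endorsement_rates(class_responses)
# List the most widely accepted misconceptions first.
for item, rate in sorted(rates.items(), key=lambda kv: -kv[1]):
    print(f"{rate:4.0%}  {ITEMS[item]}")
```

Items accepted by a large share of the class point to topics the science courses have not yet reached effectively; items accepted by only one or two pupils point to individual needs.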

Adjustment-mechanisms and detours. Some high-school 
and many college students work under special handicaps. 
Some are too ambitious for their physical strength. Others 
lack capacity or have not properly utilized their capacity to 
prepare for more advanced work. Some can attend school 
only through the sacrifice of others, and continuous awareness 
of this fact puts them under additional pressure to do well. 
Others work several hours each day to earn money to remain 
in school. Most high-school and college students also try to 
do too many things and are almost constantly active. Many 
of them gradually become fatigued as the semester goes on, and 
a thoroughly tired person can manifest many of the symptoms 
of insanity. The development of a normal healthy personality 
requires not mere ability, but activity which is reasonably 
agreeable and in which the student enjoys a certain amount of
success. No individual will develop a normal outlook on life
if his work is disagreeable, if he questions its worth, or if he is 
largely unsuccessful. As a result of these and other conditions, 
some of which are hereditary, many students face problems 
and mental conflicts which cannot be solved directly, hence 
they tend to make a greater use of detours or adjustment- 
mechanisms commonly used by everyone. Some of these are 
normal, but others are undesirable and may lead to disaster. 
It is important, therefore, that teachers learn to recognize and 
interpret evidences of these mechanisms. Among the more 
common mechanisms used when students are thwarted in sat¬ 
isfying desires related to clothes, food, marks, social life, and 
the like are the following: *

1. Compensation. Many individuals “plunge forward into some form
of activity” related to a real or imaginary defeat or defect, in
order to reduce the tensions built up; others become more active 
in other areas where they can succeed (transferred compensa¬ 
tion). The individual either achieves his original goal or a de¬ 
sirable substitute for it and makes a satisfactory adjustment. 
Demosthenes is frequently cited as an outstanding case of com¬ 
pensation of the first type, because he overcame his stuttering 
and became the greatest orator of Greece. 

2. Rationalization. As in the case of compensation the individual 
may make an adjustment satisfactory to himself through ra¬ 
tionalization, by giving acceptable reasons for his status; by 
insisting that he doesn’t desire the thing in which he was
defeated; or by saying he achieved exactly what he sought. Where
his disappointment is too great or lasting, however, he may 
develop an illness (neurosis) as an alibi. 

3. Sublimation. In this mechanism the defeated individual sub¬ 
stitutes a more highly socially approved activity. Women dis¬ 
appointed in love frequently go into nursing or Sunday school 
work when they have not been previously interested in these 
fields. This is a satisfactory solution for many. 

* See Seashore, R. H., and Katz, Barney, “An Operational Definition and Classification
of Mental Mechanisms”; in Psychological Record, vol. 1, no. 1 (March, 1937); see also Shaffer,
L. F., The Psychology of Adjustment, Boston, Houghton Mifflin Company, 1936, for a more
comprehensive treatment of these adjustment-mechanisms.

Somewhat less socially desirable, but often personally
successful, are the following mechanisms:

4. Identification. In this case the individual does not achieve his 
goal, but he adjusts by identifying himself with persons or 
movements which tend to bolster his self-respect. 

5. Projection. The individual takes pride and appears to derive 
comfort and satisfaction in believing that others are like him 
with respect to his thwarted desires; he deflects attention from 
himself by calling attention to defects, problems, or conflicts of 
others. 

6. Egocentrism. In order to satisfy himself the individual resorts 
to a variety of attention-getting types of behavior. An extreme 
manifestation of this mechanism is recognizable in the patho¬ 
logical liar. 

The following mechanisms are neither desirable from the 
social standpoint nor do they aid the individual to achieve his 
original goal: 

7. Sympathism. This is a common mechanism. Many people 
seek for sympathy instead of facing their problems. Some get 
drunk, and others seek to be babied. Some threaten (or even 
attempt) suicide with the purpose of making others sorry. 

8. Regression. In this form of adjustment the thwarted individual 
exhibits such behavior as weeping, sulking, or pretending illness. 
This type of mechanism is inferior and infantile and is as danger¬ 
ous if successful as if it fails to aid in achieving the goal since 
individuals lose whatever capacity they possess to face and deal 
with facts. 

9. Dissociation. In this type of adjustment detour the individual 
develops a secondary personality which is free from the conflicts 
and problems of the first, but which does not solve his problem. 
Stevenson's Dr. Jekyll and Mr. Hyde is an example. Dissociation 
is also present in anesthesias, amnesias, fugues, and paralyses. 

10. Repression. By this mechanism the individual avoids his prob¬ 
lems or conflicts by repressing them and refusing to think about 
them. Besides failing as a satisfactory solution for the individ¬ 
ual, it involves loss of social status, and aids in the development 
of various phobias, obsessions, and compulsions. 

11. Negativism. This unsatisfactory mechanism is characterized by
refusal to recognize the existence of the problem or conflict and
by stubbornness and rebellion. While most common in child¬ 
hood, if persisted in it may result in serious mental illness re¬ 
quiring institutional care. 

12. Day dreaming. By this mechanism the individual escapes his
conflicts by solving them in imagination. He “pictures himself
doing the deeds or possessing the things he most desires.” He
may, however, imagine himself a hero, suffering for a good cause, 
or because he is misunderstood. To a limited extent, most in¬ 
dividuals utilize this mechanism; if persisted in the individual 
may find it impossible to escape from the world of unreality. 

13. Perversions. Perversions are undesirable types of substituted 
activities. Instead of adjustment by sublimation or transferred 
compensation, the individual engages in abnormal sex activities 
such as masturbation, and in extreme cases, in sadism, mas¬ 
ochism, exhibitionism, or bestiality. 

Humm-Wadsworth Temperament Scale. The basic prin¬ 
ciple in dealing with undesirable mechanisms is to aid students 
to face their problems and solve them by approved methods. 
Some departures from normal adjustment or failure to achieve 
normal well-balanced personalities may now be detected 
through tests, thus making diagnosis in limited areas more ob¬ 
jective and saving time where teachers cannot give the neces¬ 
sary time to extended or systematic observation. The Humm- 
Wadsworth Temperament Scale * was developed primarily for 
personnel work, and standardized on adults. It should be use¬ 
ful on the college level. Its major characteristics may be 
gathered from Table 21. 

Table 21. Humm-Wadsworth Components

Each group of traits dealt with is regarded as a component of
temperament, and is presumed to be present in some degree in every
individual. Analysis of temperament requires measurement of each
component. Diagnosis consists of identifying the component which
appears to predominate or to be emphasized in the temperamental
make-up of a given subject.

The components are listed below with the constitutional, mental, or
nervous disorder in which an extreme degree of each is typically
observed:

Component                 Disorder in Which Observed    Characterizations

1. “Normal”               --                            Primarily a control mechanism
                                                        providing balance
2. Hysteroid or           Hysteria, criminalism         Ethically inferior motivation
   anti-social
3. Cycloid
   a. Manic phase         Manic-depressive psychoses    Elation, excitement, enthusiasm,
                                                        impatience
   b. Depressive phase    Involutional melancholia      Sadness, worry, timidity,
                                                        malice, etc.
4. Schizoid
   a. Autistic phase      Dementia praecox              Seclusiveness, shyness,
                                                        suggestibility
   b. Paranoid phase      Paranoiac conditions          Stubbornness, conceit, suspicion,
                                                        contempt
5. Epileptoid             Mental disorders allied       Meticulous planning, persistence,
                          with epilepsy                 explosiveness

The object of the temperament scale is to present a diagnostic
analysis of temperamental mechanisms which underlie the
characteristic reactions of a given individual to environmental
situations and to other people. The measurement of intelligence and
of aptitude does not provide adequate data regarding temperament.
The authors indicate that “disturbances are considered mere extremes
in reactions, or behavior, arising from the same basic temperamental
equipment exhibited by normal individuals. The essence of normality
appears to be that of a balanced condition which promotes controlled
utilization of temperamental mechanisms. In abnormality, certain
traits or predispositions are ‘out of hand,’ or stated otherwise,
the positive influences which normally maintain temperamental
balance have ‘let go’ or ‘given way.’ The underlying temperamental
structure can then be observed in exaggeration.”

The Bernreuter Personality Inventory.† This diagnostic
device has been widely used in this connection to obtain
measures of dominance or submission, introversion and
extroversion, and self-sufficiency, as well as a measure of
neurotic tendency. A crucial percentile score has been determined
in connection with the latter, below which students are usually
well balanced emotionally, but above which they are advised to
consult a physician or psychiatrist.

* Humm, D. G., and Wadsworth, G. W. Temperament Scale. Los Angeles, Calif.,
Doncaster G. Humm, 1934.

† Bernreuter, R. G. The Personality Inventory. Stanford University, Calif., Stanford
University Press, 1931.

Needless to say, no other outcomes of education can compensate
for breakdowns in physical or mental health. Students who
experience significant learning difficulties which do not
respond to ordinary measures should be checked for personality
disorders.

The attitude scales edited by Remmers and those edited 
and developed by Thurstone will also prove helpful in adjust¬ 
ment diagnosis. 

The Bell Adjustment Inventory * provides four separate meas¬ 
ures of adjustment; namely, home, health, social, and emo¬ 
tional. Norms are provided, and the high reliability of the 
measures makes it possible to compare individuals with each 
other, and with more or less normal or average adjustment. 
The following are sample items from the Bell Adjustment
Inventory:

Yes  No  ?  Are you subject to hay fever or asthma?

Yes  No  ?  Do you often have much difficulty in thinking of
            an appropriate remark to make in group conversation?

Yes  No  ?  Have you been embarrassed because of the type
            of work your father does in order to support the
            family?

Yes  No  ?  Do you frequently suffer discomfort from gas in
            the stomach or intestines?

Yes  No  ?  Did you ever take the lead to enliven a dull
            party?

Yes  No  ?  Is (was) your mother the dominant member of
            the family?

Yes  No  ?  Have you ever felt that someone was hypnotizing
            you and making you act against your will?

Yes  No  ?  Has either of your parents frequently criticized
            you unjustly?

Yes  No  ?  Do you feel embarrassed when you have to enter
            a public assembly after everyone else has been
            seated?

Yes  No  ?  Do you often feel lonesome, even when you are
            with people?

Yes  No  ?  Do you feel there has been a lack of real affection
            and love in your home?

Yes  No  ?  In school is it difficult for you to give an oral
            report before the class?

* Bell, H. M. The Bell Adjustment Inventory. Stanford University, Calif., Stanford
University Press, 1934.

The California Test of Personality.* Like the elementary 
battery already described in a previous chapter, the ad¬ 
vanced battery provides for a comparison of the manner in 
which each pupil meets his problems with the manner in which 
pupils in general react to the same situations. Figure 45 
presents the diagnostic profile of the advanced battery. 

Significant departures from characteristic behavior, par¬ 
ticularly deviations so serious as to indicate possible or actual 
danger, are revealed graphically. 

Pupil reactions to items are not obtained primarily for section
or even total scores, but to reveal adjustment difficulties.
Each group of related unsatisfactory responses, therefore,
constitutes, in a sense, a major objective of guidance.

The authors call special attention to the fact that interpre¬ 
tations of the profiles should be made and plans of guidance 
should be projected not only in terms of the profiles them¬ 
selves, but also in terms of data on mental maturity, skill 
mastery, physical maturity, health, and the like, on which 
normal adjustment depends. 

This test is, therefore, an attempt to implement the legiti¬ 
mate idealism of the modern movement in education, which 
stresses the “wholeness” of the learner and the desirability of 
guiding his learning activities in the light of his total personal¬ 
ity pattern. Without an instrument of this type, this idealism 
must, for the most part, remain wishful thinking for many. 
For many others it must mean, not the use of carefully planned, 
validated or defensible procedures, but the continuance of im¬ 
pulsive, opportunistic improvisations of the moment, many of 
which do violence to the nature and needs of the learner as well
as to the goals toward which his activities are presumably
being directed.

* By Thorpe, L. P., Clark, Willis, and Tiegs, E. W. Los Angeles, Calif., California Test
Bureau, 1939.

The test will aid teachers in bringing together and organizing
important but elusive data so that they may converge on
the identification and elimination of learning and adjustment 
difficulties. 

Cheating in examinations. Cheating in tests and examina¬ 
tions is a problem which is present on all levels, but as com¬ 
petition develops on the secondary and higher levels and more 
is at stake, it appears to be intensified. The problem is diffi¬ 
cult to deal with because parents rarely believe that students 
will cheat and they resent the resultant publicity. One student 
reported that two-thirds of her classmates in a mid-western
teachers college cheated regularly in examinations. An
eastern high-school principal reported that it was a serious
problem which could not be adequately controlled even with
the use of proctors. An instructor in a private boys’ school
in a western state said he had cheated his way through school,
and he proposed to “give the boys a break.”

The reasons most frequently given for cheating, in a survey 
of college students, were as follows: 

1. Lack of preparation

2. Others cheat; I do it to protect my grades

3. Lack of self-confidence

4. Mob psychology; don’t really need to

5. Parents never considered it serious

6. Unfair quizzes and examinations; some questions never
mentioned in class or textbooks

7. Fear of consequences; bad marks, punishment at home, failure
to graduate or obtain certificates

8. Unsympathetic attitude of teachers

9. Overemphasis on examinations; too much depends on one final test

10. Failure to punish cheaters adequately when caught

The most frequently mentioned suggestions of students for 
solving the problem were as follows: 

1. Abolish marking 

2. Give everyone the same mark 

3. Proctor adequately; do not tempt us 

4. Give objective tests with time limits; no time to cheat 

5. Have different forms of objective tests and examinations so that
no student will be doing the same test as those near him

6. Individual private, oral examinations 

7. Make quizzes and examinations fairer 

8. Frequent talks by teachers, explaining the nature and reasons for 
examinations, and the probable result of cheating whether caught 
or not 

While undoubtedly some cheating is due to mere unfor¬ 
tunate habituation, there is no doubt that much of it is induced 
by the teaching and testing practices of instructors themselves.
The motives for cheating would largely disappear if
the following conditions were met: 

1. Let each teacher develop an adequate list of the objectives for 
each course, subject, or unit which he offers. 

2. Let assignments, lectures, discussions, laboratory work, etc., deal 
with these objectives. 

3. Use frequent diagnostic tests to discover unattained objectives, 
and do something to aid students, instead of merely marking 
them. Accumulate the results of these tests and use them as the 
partial basis of marking. 

4. Be sure that the final examinations are well-balanced samplings of 
the objectives which have been stressed in teaching and diagnosis; 
do not weigh the final examination too heavily in the final mark. 

It is a mistake to assume that someone must fail in any 
class. To do so sets up a vicious competitive spirit for marks 
that encourages the development of most undesirable traits 
and defeats some of the most important objectives of educa¬ 
tion. If the same students failed consistently, lack of ability 
might be assigned for failure, but very often students failed by 
some instructors do excellent work for others. Differences in 
the fields themselves do not appear as significant as the lack of 
teaching and appraising skill of instructors. Without stand¬ 
ardized appraisal examinations only the most extreme cases 
of lack of achievement can be identified. Competition for 
marks is not the most desirable motivation for encouraging 
scholarship. 

Crawford * has devised a technique for detecting suspected
cheating, consisting of three parts: 

1. How to discover and recognize evidences of cheating 

2. How to determine the probability that cheating has occurred 

3. How to verify the results of 1 and 2 above

While a student should always be presumed innocent until 
proved guilty, most guilty students confess when they see and 
understand the evidence.

* Crawford, C. C. “Dishonesty in Objective Tests”; in School Review, vol. 38, pp. 776-81.

Physical and health diagnosis. Health is an important objective
in its own right even though it were not related to other
objectives. However, research reveals that health is definitely
related to adjustment and to learning. For example, Stedman *
computed the average number of grade points per semester
of high-school students with various disabilities and compared
them with students free from these defects. Her results
are presented in brief form in Table 22.


Table 22. Relation Between Physical and Health Defects
and School Marks

No. of                                              Average Grade Points
Students   Physical Condition                       Per Semester

   39      Healthy students                         11.36
   13      With lung defects                         7.33
  136      With defective teeth                      8.33
   91      With eye defects                          8.88
   70      With heart defects                        8.73
  312      With posture and orthopedic defects       9.58

Pupils with I.Q.’s above 115 appeared to be more seriously
handicapped by health difficulties than those with lower
I.Q.’s.
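
The comparison implied by Table 22 is a simple subtraction of each group's average from that of the healthy group. The sketch below only restates the table's figures and works out the differences; the function name is illustrative, and the differences describe these groups without by themselves showing what caused them. Note also that the groups differ greatly in size (13 pupils with lung defects against 312 with posture defects).

```python
# (condition, number of students, average grade points per semester),
# transcribed from Table 22.
TABLE_22 = [
    ("Healthy students", 39, 11.36),
    ("With lung defects", 13, 7.33),
    ("With defective teeth", 136, 8.33),
    ("With eye defects", 91, 8.88),
    ("With heart defects", 70, 8.73),
    ("With posture and orthopedic defects", 312, 9.58),
]


def deficits(table):
    """Grade points by which each group falls below the healthy group."""
    healthy_avg = next(avg for label, _, avg in table
                       if label == "Healthy students")
    return {
        label: round(healthy_avg - avg, 2)
        for label, _, avg in table
        if label != "Healthy students"
    }


gaps = deficits(TABLE_22)
for label, gap in gaps.items():
    print(f"{label:38s} {gap:5.2f} grade points below healthy students")
```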

To achieve most, the students † should:

1. Be free from organic or physical defects 

2. Develop native physical capacities to the maximum 

3. Develop organic functioning to a reasonable degree of perfection 

4. Develop interest in a variety of big- and small-muscle activities 

Many students suffer from remediable digestive difficulties 
resulting partly from the nervous tension under which they 
work. Such difficulties not only interfere with learning in the 
academic sense, but often produce other adjustment problems 
as well. 

* Study conducted at Bell High School, Los Angeles, Calif.

† LaPorte, W. R. “Changing Conception of College Physical Education”; in Research
Quarterly, American Physical Education Association, vol. 2, p. 5.

Longitudinal arch difficulties may be diagnosed from footprints.
Schwartz and Rogers both secured coefficients of reliability
of over .90 in determining footprint angles. Tentative
norms for men have been set at 42°, with a critical score of
30°; all individuals with angles below 30° should receive special
treatment.
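
The screening rule just stated can be written out as a short check. The 42° norm and 30° critical score come from the text; the function name and the wording of the middle band (between the critical score and the norm) are assumptions added for illustration, since the text prescribes action only for angles below 30°.

```python
NORM_ANGLE_DEG = 42.0      # tentative norm for men, from the text
CRITICAL_ANGLE_DEG = 30.0  # critical score, from the text


def classify_footprint_angle(angle_deg):
    """Place a measured footprint angle relative to the norm and critical score."""
    if angle_deg < CRITICAL_ANGLE_DEG:
        return "below critical score: special treatment indicated"
    if angle_deg < NORM_ANGLE_DEG:
        # Assumed label; the text gives no rule for this band.
        return "between critical score and norm"
    return "at or above norm"


for angle in (27.5, 36.0, 44.0):
    print(angle, "->", classify_footprint_angle(angle))
```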

Serious posture defects are usually evident, but the exact
points at which remedial activity should be directed must still
be determined through diagnosis. Lesser defects are uncovered
by checking all students. In using the silhouettograph
and similar instruments, examiners should not be satisfied with
single outlines or pictures. Because of changes in posture
from time to time a series of samples of posture are needed.
Furthermore, devices like the conformator used by McCoy,*
which come in actual contact with the spine and produce an
objective permanent record, are useful not only in original
diagnosis but in checking the effects of remedial activity.

High-school and college work puts a heavy strain on eyes. 
Hence adequate periodic checks should be made as a matter 
of routine, and at other times when difficulty is experienced. 

Ideally, all students in high school and college (and else¬ 
where) should have complete physical examinations at regular 
intervals by qualified examiners, and results should be available 
to those who are responsible for their growth and development. 

The tools of learning. Perhaps the most frequent cause of 
difficulty in high school as well as college is lack of mastery of 
the learning skills, particularly of reading ability. Some sur¬ 
veys have shown as high as thirty per cent of some high-school 
groups with reading deficiencies; college classes contain approx¬ 
imately fifteen per cent of students who cannot read suffi¬ 
ciently well to do good work. Business schools report that 
high-school and college graduates are atrociously deficient in 

^ McCoy, C. H. “Measurement of General Motor Capacity and General Motor Abil¬ 
ity"; in Research Quarterly, American Physical Education Association, vol. 4, p. 87. 
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Knglish, spelling, and other fundamentals. Wh 
tent of courses changes, certain skills necessary fc 
changing content and objectives are standard i 
Every teacher should have a record of status in 1 
every student who is experiencing difficulty. 

Figure 46 presents a typical diagnostic profile *
checkup of these basic skills. The proportion o
the left of the heavy vertical line indicates the e> 
the student falls below the general average with 
various skills tested. Whereas I.Q. predicts ac 
the extent represented by a correlation coefiicien 
the reading score of such a test has a higher pre 
while the total score sometimes predicts to the
sented by a correlation coefficient of .85 which :
the predictive value of most tests of specific apt 
the profile of a student experiencing difficulty i 
cant weaknesses in such important skills as res 
hension or reasoning, the solution of the prot 
among other things, improvement in such skills. 

The discovery of reading disability as a factor 
tory learning does not necessarily reveal its caus 
ability may be due to a number of factors, amc 
rate of reading, mental immaturity, visual defect 
reading materials, and poor eye movement. 

Reading rate is significant only when the mab 
comprehended, hence a comprehension test shot 
company the measurement of rate. 

A Test Given in January to a 10th Grade Pupil

DIAGNOSTIC PROFILE

                                   Possible
TEST SUBJECT                       Score

1. Reading Vocabulary                100
   A. Mathematics                     25
   B. Science                         25
   C. Social Science                  25
   D. Literature                      25

2. Reading Comprehension              55
   E. Following Directions            10
   F. Organization                    15
   G. Interpretations                 30

3. Mathematical Reasoning             60
   A. Number Concept                  20
   B. Symbols and Rules               15
   C. Numbers and Equations           10
   D. Problems                        15

4. Math. Fundamentals                 80
   E. Addition                        20
   F. Subtraction                     20
   G. Multiplication                  20
   H. Division                        20

5. Language                          125
   A. Capitalization                  15
   B. Punctuation                     10
   C. Words and Sentences             25
   D. Grammar                         30
   E. Spelling                        30
   F. Handwriting                     15

TOTAL                                420

FIG. 46. DIAGNOSTIC PROFILE, PROGRESSIVE ACHIEVEMENT
TESTS, ADVANCED BATTERY

* From the Progressive Achievement Tests, Advanced Battery. Los Angeles, Calif.,
California Test Bureau, 1935.

Certain aspects of the difficulty of much of th
terials used on the primary and elementary leveh
been determined. Such materials may now h<
propriate levels, and assignments made with resp
abilities and disabilities. On the high-school, an
the college level, much less has been done. A number of
workers * have contributed to the solution of the problem.
Lewerenz has developed five measures for evaluating vocabulary:

1. Vocabulary Difficulty Grade Placement, a measure of the ratio 
of special meaning words derived largely from Greek and Latin 
sources to common words derived largely from Anglo-Saxon roots. 

2. Polysyllabic Word Grade Placement, a measure of the frequency 
of certain words having two or more syllables. 

3. Vocabulary Diversity Grade Placement, a measure of the ratio 
of the different words above the first 500 in frequency to the 
number of different words below the first 500 in frequency. 

4. Vocabulary Mass Grade Placement, a measure of the frequency of 
different words above the first 500 in frequency to the total num¬ 
ber of different words found in a sampling of one thousand running 
words. 

5. Vocabulary Interest Rating, a measure of the ratio of different
interest words found to the total number of different words
recorded in a sampling of one thousand running words. In
counting interest words a basic check list of adjectives and
adverbs is utilized.
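
Two of the ratios defined above (measures 3 and 4) can be illustrated with a short computation. This is a sketch under stated assumptions, not Lewerenz's published formula: "above the first 500 in frequency" is read here as "not among the 500 most frequent words," the tiny `common_words` set stands in for a real 500-word frequency list, and the conversion of a ratio to a grade placement is not attempted because the text does not give it.

```python
import re


def vocabulary_ratios(text, common_words, sample_size=1000):
    """Compute diversity and mass ratios on a sample of running words.

    common_words is a stand-in for the list of the 500 most frequent words.
    """
    running_words = re.findall(r"[a-z']+", text.lower())[:sample_size]
    different_words = set(running_words)
    if not different_words:
        raise ValueError("sample contains no words")
    common = different_words & common_words
    uncommon = different_words - common_words
    return {
        # Measure 3: uncommon different words to common different words.
        "diversity_ratio": len(uncommon) / len(common) if common else None,
        # Measure 4: uncommon different words to all different words.
        "mass_ratio": len(uncommon) / len(different_words),
    }


# Hypothetical stand-in for the 500-word list, and an invented sample.
common_words = {"the", "a", "of", "and", "to", "is", "in"}
sample = ("The pupil is in the laboratory and the pupil measures "
          "the velocity of a pendulum")
ratios = vocabulary_ratios(sample, common_words)
print(ratios)
```

In this sample there are 11 different words, of which 5 fall outside the stand-in list, so the mass ratio is 5/11 and the diversity ratio is 5/6. With a genuine frequency list and a full sample of one thousand running words the same counts would apply.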

Washburne † has developed a method for determining grade
placement of children’s recreational reading materials.
Thorndike ‡ has attacked the problem not only from the standpoint
of vocabulary, but also of difficulty of sentence structure. It
is of course evident that none of these is a complete measure of 
difficulty. A word, for example, may be used several ways or 
in different constructions, and may therefore have several 
measures of difficulty. Construction difficulty also varies with 
such factors as differences in vocabulary and types of reading 
materials. Such factors should be considered in interpreting 
the results of grade-placement studies. 

* See Lewerenz, A. S. Techniques for the Objective Evaluation of the Vocabulary Used in
Printed Matter. Ed.D. Thesis. University of Southern California, 1937; also Vocabulary
Grade Placement Formula, published by the Research and Guidance Section of the Los
Angeles City School District.

† Washburne, Carleton. Winnetka Chart for Determining Grade Placement of Children’s
Books. Winnetka Educational Press, Horace Mann School, 1934.

‡ Thorndike, E. L. “Improving Ability to Read”; in Teachers College Record, Columbia
University, vol. 36, no. 1, pp. 12, 13, 15. (October, 1934.)

Eye movements may be photographed with the Ophthalm-O-Graph *
and similar instruments as on lower levels, and rate
of reading as well as the number and duration of fixations and 
regressive movements determined. Figure 47 presents two 
reading graphs, specimen “A” that of a slow reader whose 
rate is but 152 words per minute, and “B” that of a student 
who reads 585 words per minute. 

FIG. 47. READING GRAPHS OF SLOW AND FAST READERS
(Specimen A, slow reader; specimen B, fast reader.)

* Distributed by the American Optical Company.

Data published by the American Optical Company also reveal
that students on the high-school and college levels, as well
as the primary and elementary, respond to training with the
Metron-O-Scope and that reading rates can be greatly
increased. Table 23 presents norms for rate, fixations, and
regressions for seventh-graders, and the averages for
high-school and college students.


Table 23. American Optical Company Eye-Movement Norms
(Data from Ophthalm-O-Graph [illegible] Metron-O-Scopic training.
Based on silent reading of standard material for the various
grade levels)

                                      Seventh      High
                                      Grade        School       College

Average reading rate, words per
  minute                              210          295          [illegible]
  Averages after Metron-O-Scopic
  training                            [illegible]  [illegible]  [illegible]

Average number of fixations per
  100 words                           [illegible; only “93” is legible]
  Averages after Metron-O-Scopic
  training                            [illegible]  [illegible]  [illegible]

Average number of regressions per
  100 words                           21           17           10
  Averages after Metron-O-Scopic
  training                            [illegible]  [illegible]  [illegible]

To use the table: Compute data of film record and compare the number
of fixations and regressions and the speed with the appropriate grade level.

For instance, using the first line of figures in each item, if the subject’s
reading rate is 295 words per minute he is at the high-school level in terms
of speed. If he makes 110 fixations in reading 100 words he is better than
seventh grade in terms of fixations; and if he makes 17 regressions in reading
100 words he is average for high-school level.

The second line of figures in each item provides comparable data taken
from the reading graphs of subjects who had received Metron-O-Scopic
training.
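
The lookup described in the directions for Table 23 can be sketched for the regression norms, read here as 21, 17, and 10 regressions per 100 words for seventh grade, high school, and college. The function name is illustrative, and the rule (report the highest level whose norm the subject meets, fewer regressions being better) is an interpretation of the worked example in the directions, not a published procedure.

```python
# Regressions per 100 words, ordered from lowest to highest level.
REGRESSION_NORMS = [
    ("seventh grade", 21),
    ("high school", 17),
    ("college", 10),
]


def regression_level(regressions_per_100_words):
    """Return the highest level whose regression norm the subject meets.

    Returns None when the subject makes more regressions than the
    seventh-grade norm allows.
    """
    level = None
    for name, norm in REGRESSION_NORMS:
        if regressions_per_100_words <= norm:
            level = name
    return level


for count in (25, 19, 17, 8):
    print(count, "->", regression_level(count))
```

A subject with 17 regressions is placed at the high-school level, in agreement with the example in the directions; one with 19 meets only the seventh-grade norm.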


The causal factors of difficulties in thinking or reasoning 
should be sought in a similar manner. Quantitative thinking 
involves reading, mathematical, and language abilities, and 
causes of difficulty must be sought in these among other areas. 
Failure to master fundamentals of arithmetic reduces ability 
to do mathematical reasoning. Failure to read and comprehend
the conditions of a problem leads to incorrect solutions.
The reasons for failure to reach correct conclusions must oc¬ 
casionally be determined by individual oral examinations where 
the student tells why he takes each step in the attempted solu¬ 
tion of a problem, and in addition actually attempts the solu¬ 
tion as he talks. 

The following are among the most common reading difficul¬ 
ties encountered on this level: 

1. Reads too slowly; can’t cover assignments 

2. Reads too fast; does not comprehend 

3. Recognition span too narrow; too many fixations 

4. Too small reading vocabulary; too many new words, poor com¬ 
prehension 

5. Misreads familiar words; visual defects, carelessness 

6. Does not follow lines consistently; visual defect or habit 

7. Poor memory; does not remember or assimilate previous ideas in 
the selection read 

8. No interest in reading; neglects it when possible 

The results of the comprehensive and discriminating Dartmouth
College study * cast some doubt upon the significance
of reading rates and reading-rate gains but confirm certain
other beliefs regarding the relation between rate and
comprehension. Among the conclusions of this study are the
following:

1. Efficient reading, as measured by reading tests and by the eye- 
movement camera, does not guarantee high academic performance 
within the Dartmouth group. 

2. Correction of ocular defects does not guarantee increased ability 
in reading or greater academic achievement within one college 
year of those receiving ocular correction; however, eighty-three 
per cent report that they were helped. 

3. The records obtained by the eye-movement camera are so unreliable
that individual diagnosis of difficulty in reading cannot
be made on the basis of scores derived from them.

4. Study of the relationship between the scores on the eye-movement
camera and reading tests indicates that these instruments are not
measuring the same ability.

5. The frequency of regressive eye movements as measured by the
eye-movement camera is not necessarily a handicap to efficient
reading.

6. Speed of reading can be increased by instruction and supervised
practice. Such gains are retained for a period of at least one year.

7. Increase in speed may be secured without loss of comprehension.

8. Academic performance is seldom improved by increasing the rate
of reading.

* Imus, H. A., Rothney, J. W. M., and Bear, R. M. An Evaluation of Visual Factors in
Reading. Hanover, N.H., Dartmouth College Publications, 1938.

The fact that reading ability does not guarantee academic
performance does not of course mean that reading ability is
not important and that a lack of reading ability may be ignored.
The results obtained suggest rather that reading ability
is not the determining factor in academic success, as is often
assumed. The fact that reading ability and academic success
are not necessarily increased by correction of ocular defects
may mean that compensating patterns have been developed
and set by the time certain students reach college and that
they succeed in spite of handicaps. As the study itself suggests,
the unreliability of eye-movement rate records may be
due to the restricted sampling, and, further, if eye-movement
photograph records were made five to seven times as long, the
reliability of reading-rate data would be greatly increased.
Considerable discrepancy between eye-movement scores and
reading test scores is to be expected. The former are records
of physical movements; the latter include the complex factors
involved in comprehension as well as reading rate. The study
confirms the fact that reading rate may be increased without
loss of comprehension. The apparent lack of relationship
between increased reading rate and academic success may be
due to a number of factors, chief among them being the unreliability
both of the reading-rate scores and the criterion of
academic success.
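
The remark that records "five to seven times as long" would be more reliable can be illustrated with the Spearman-Brown relation, which the text does not name but which is the usual way of estimating the effect of lengthening a measure. The following is a minimal sketch only; the starting reliability of .50 is a hypothetical value and is not taken from the Dartmouth study.

```python
def spearman_brown(reliability: float, length_factor: float) -> float:
    """Predicted reliability when a measure is made `length_factor` times as long."""
    return (length_factor * reliability) / (1 + (length_factor - 1) * reliability)


# Hypothetical reliability of a short eye-movement record (an assumption).
short_record_reliability = 0.50

for factor in (1, 5, 7):
    predicted = spearman_brown(short_record_reliability, factor)
    print(f"record {factor} times as long: predicted reliability {predicted:.2f}")
```

Under this assumed starting value, a record five times as long would be expected to reach a reliability of roughly .83, and one seven times as long somewhat higher; a different starting value would give different figures.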
The study illustrates the constant danger of misinterpreting
the significance of individual factors in complex learning patterns
and emphasizes the necessity for additional research in
the potential contributions of eye-movement photography.

Prediction and educational guidance. Much later difficulty 
and disappointment would be avoided if success in various 
types of high-school and college work could be predicted, and 
if, where a choice exists, students could be directed to fields in 
which they have a better chance to succeed. Considerable 
work has been done in prediction, but certain conditions make 
progress difficult. Reading ability, for example, is apparently 
closely related to success, yet when high-school and college 
instructors use subjective and traditional methods of marking 
rather than objective evidence, measures of correlation appear 
low. In the same manner, prediction of success in college from 
the results of good achievement or specific aptitude tests may 
be low because of the lack of validity of college marks. Segel’s
summary * of a great many predictive studies indicates that
general achievement tests come first, general mental tests second,
and tests of specific traits, aptitudes, or achievement come
third in predictive value, the average coefficients being
.545, .44, and .37 respectively.
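
To give these coefficients a concrete meaning, the square of a correlation coefficient is commonly read as the proportion of variance in the criterion associated with the predictor. The sketch below applies that reading to the three averages quoted above; the interpretation is a standard statistical convention and not a claim made in Segel's summary.

```python
# Average predictive coefficients quoted in the text from Segel's summary.
coefficients = {
    "general achievement tests": 0.545,
    "general mental tests": 0.44,
    "specific trait or aptitude tests": 0.37,
}


def variance_explained(r: float) -> float:
    """Proportion of criterion variance associated with the predictor (r squared)."""
    return r * r


for name, r in coefficients.items():
    print(f"{name}: r = {r:.3f}, r squared = {variance_explained(r):.2f}")
```

Read this way, even the best class of predictor accounts for less than a third of the variation in the marks used as the criterion.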

As long as we must use unreliable marks as the standard or
criterion by which to judge the validity or predictive value
of prognosis tests we cannot expect to increase the usefulness
of such tests. When high-school and college instructors use
as good tests in measuring achievement as are used in its prediction,
student guidance possibilities from such tests, even in
the hands of relatively untrained workers, will be greatly increased.
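
The effect of an unreliable criterion on an observed validity coefficient can be sketched with the classical attenuation relation, in which the observed correlation equals the underlying correlation multiplied by the square root of the product of the two reliabilities. The relation is standard test theory rather than something stated in the text, and every number below is hypothetical.

```python
import math


def attenuated_validity(true_r: float, test_reliability: float,
                        criterion_reliability: float) -> float:
    """Observed correlation expected when both measures contain error."""
    return true_r * math.sqrt(test_reliability * criterion_reliability)


# Hypothetical values: a prognosis test with reliability .90 and an
# underlying relationship of .70 with achievement.
true_r = 0.70
test_reliability = 0.90

for marks_reliability in (0.50, 0.90):
    observed = attenuated_validity(true_r, test_reliability, marks_reliability)
    print(f"criterion reliability {marks_reliability:.2f}: "
          f"observed validity about {observed:.2f}")
```

With these assumed figures the same test appears much less valid when judged against unreliable marks than against a criterion as reliable as the test itself.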

* Segel, David. Prediction of Success in College, p. 69. Office of Education Bulletin, no. 15, 1934.

The values of present measures may be increased through
differential prediction. In this method, correlation coefficients
of several predictive tests are utilized to find multiple correlation
coefficients for predicting success in various fields. The
statistical significance of the differences, if any, between these
predictive coefficients is then computed if they are sufficiently
high to be promising. Segel * utilized multiple correlation coefficients of
.50 or higher, and computed differential prediction equations.
Differential as well as direct prediction will be improved
through better achievement testing. Most teachers will be
interested for some time to come in tests of specific aptitude
which may be utilized without any particular knowledge of
statistical procedures.
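
How several predictive tests combine into one multiple correlation coefficient can be sketched for the simplest case of two predictors. The formula is the standard two-predictor expression; the three input correlations are hypothetical and are not drawn from Segel's data. The .50 threshold in the last line is the one the text attributes to Segel.

```python
import math


def multiple_r(r1y: float, r2y: float, r12: float) -> float:
    """Multiple correlation of a criterion with two predictors.

    r1y, r2y: correlation of each predictor with the criterion.
    r12: correlation between the two predictors.
    """
    r_squared = (r1y ** 2 + r2y ** 2 - 2 * r1y * r2y * r12) / (1 - r12 ** 2)
    return math.sqrt(r_squared)


# Hypothetical correlations for two tests and marks in one subject field.
combined = multiple_r(r1y=0.45, r2y=0.40, r12=0.30)
print(f"multiple R = {combined:.2f}")
print("promising for differential prediction:", combined >= 0.50)
```

Two tests that each correlate only moderately with the criterion can together pass the threshold, provided they do not overlap heavily with one another.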

Besides current intelligence and achievement tests several
tests of more specific aptitudes are utilized for predictive purposes.
Among them are tests in mathematics, science, English,
music, languages, and mechanical ability.

Lee’s test of algebraic ability consists of four parts: arithmetic
problems, analogies, number series, and formulas. It
possesses a reliability of .93 and a coefficient of validity of .71
to .76. The most probable per cent of failures at each score
level may be determined.
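
The idea of a "most probable per cent of failures at each score level" can be sketched as a simple expectancy table built from past records. The records, the band width, and the function name below are all hypothetical; they do not reproduce the norms published with Lee's test.

```python
from collections import defaultdict

# Hypothetical (prognosis score, failed the course?) records.
records = [
    (12, True), (18, True), (15, False),
    (24, True), (27, False), (22, False), (29, False),
    (35, False), (38, False), (31, True), (33, False),
]


def failure_rates(records, band_width=10):
    """Per cent of failures in each score band, keyed by the band's lower bound."""
    totals = defaultdict(int)
    failures = defaultdict(int)
    for score, failed in records:
        band = (score // band_width) * band_width
        totals[band] += 1
        failures[band] += int(failed)
    return {band: 100.0 * failures[band] / totals[band] for band in sorted(totals)}


for band, per_cent in failure_rates(records).items():
    print(f"scores {band}-{band + 9}: {per_cent:.0f} per cent failed")
```

A counselor would read such a table by locating a student's score band and noting the proportion of earlier students in that band who failed.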

The Lee Test of Geometric Aptitude * is prepared in two
forms and may be used to predict the probability of success in
geometry as an aid to counseling and guidance. The high
validity of this test is indicated by its capacity to predict achievement
that corresponds closely to later achievement test results.
Figure 48 presents sample items from Form A.

Similar prognosis tests have been devised by Orleans for algebra
and geometry. The Zyve Stanford Scientific Aptitude
Test is available in science. The Gross English test is widely
used as a placement and exemption test. Orleans and Solomon,
and Luria and Orleans, have devised prognosis tests in

* Segel, David. Prediction of Success in College. Office of Education Bulletin, no. 15,
1934, p. 82. See also “Differential Prediction of Ability as Represented by College Subject
Groups,” in Journal of Educational Research, vol. 35, pp. 14-36 and 93-98.

* Lee, J. Murray. Test of Geometric Aptitude. Los Angeles, Calif., California Test
Bureau, 1931.

Look at the figure. You are given the length of some of the lines.
Find the length of the lines and perimeters asked for and put the 
answers in the answer column. The numbers on the lines indicate 
the length of the line between the letters. 



Answer Column

1. How long is HF? (1)

2. How long is AH? (2)

3. How long is ____? (3)

4. How long is JI? (4)

5. What is the perimeter of GJIF? (5)


FIG. 48. SAMPLE ITEMS FROM THE LEE TEST OF GEOMETRIC APTITUDE

Latin and modern foreign languages respectively. Information
concerning these and other similar tests may be found in
bibliographies and the catalogues of publishers.

There is much confusion regarding the meaning and functions
of guidance. In one sense all teaching should be guidance,
since it should be the direction of learning in the light of
objectives set up and in terms of interests, needs, and abilities
of pupils. Williams * found many different types of activities,
some of which did not contribute very directly to the above objectives,
designated as guidance in his study of Southern California
school systems.

* Williams, Don T. Some Values and Limitations of Guidance Programs in Seven Selected
Public School Systems in Southern California. M.S. in Education Thesis, University of
Southern California, 1937.

Vocational and professional interests and abilities. When 
prognosis tests are used in educational guidance the criterion 
or standard of success is achievement in high school or college. 
In using such tests in vocational guidance actual success in the 
vocation is the criterion. Like instructors’ marks in high 
school and college, ratings of vocational success are not highly 
reliable, hence no high correlations can be obtained between 
tests and estimates of vocational success. 

Diagnosis in vocational interests and learning centers around
four factors:

1. Actual vocational and professional opportunities of a community
and surrounding communities

2. Tests and measurements of student capacity, achievement,
interests, traits, and abilities

3. Organization of the learning program in terms of basic vocational
and professional opportunities

4. Organization for contacts and interchange of information between
instructors and students and vocational and professional
opportunities

A special department should be organized to handle the first
and fourth factors. The second and third are related to testing.

Table 24 presents the mean scores for selected groups of the
Allport-Vernon * scale for measuring dominant interests.

This scale illustrates the discovery of attitudes through utilizing
specific factual experiences rather than through the
“conclusions of the witness” method so often attempted. The
manual of directions indicates the manner in which the obtained
data may be utilized.
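
When reading Table 24 it helps to know how the combined row was formed. Judging from the printed figures, the combined means appear to be simple averages of the male and female means rather than averages weighted by group size; this is an inference from the numbers, not a statement in the source. The sketch below checks the inference on four of the six columns.

```python
# Means read from Table 24 for four of the six values.
males = {"theoretical": 30.83, "economic": 32.02, "aesthetic": 27.04, "religious": 27.96}
females = {"theoretical": 27.69, "economic": 27.04, "aesthetic": 33.03, "religious": 33.31}
combined = {"theoretical": 29.26, "economic": 29.53, "aesthetic": 30.04, "religious": 30.63}
n_males, n_females = 1163, 1592


def simple_mean(a: float, b: float) -> float:
    """Unweighted average of two group means."""
    return (a + b) / 2


def weighted_mean(a: float, n_a: int, b: float, n_b: int) -> float:
    """Average of two group means weighted by group size."""
    return (a * n_a + b * n_b) / (n_a + n_b)


for value, printed in combined.items():
    simple = simple_mean(males[value], females[value])
    weighted = weighted_mean(males[value], n_males, females[value], n_females)
    print(f"{value}: printed {printed:.2f}, simple {simple:.3f}, weighted {weighted:.3f}")
```

The simple averages agree with the printed row to within rounding, while the weighted averages do not, so the combined row gives the two sexes equal weight despite the unequal group sizes.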

Other more specific interests in the commercial, mechanical, 
and scientific fields may be determined with the use of other 

* Allport, Gordon W., and Vernon, P. E. A Study of Values. Boston, Houghton Mifflin
Company, 1931.
TABLE 24. MEAN SCORES FOR SELECTED GROUPS ON ALLPORT-VERNON STUDY OF VALUES

Group                                 Theoretical  Economic  Aesthetic  Social  Political  Religious

Sex Differences
1163 males                            30.83        32.02     27.04      29.74   32.08      27.96
1592 females                          27.69        27.04     33.03      31.65   27.87      33.31
2755 combined                         29.26        29.53     30.04      30.69   29.98      30.63

Illustrative Occupational Differences
26 students of science (male)         34.3         28.?      28.6       29.4    33.4       25.8
64 students in engineering            3?.0         35.1      26.1       29.0    31.4       26.4
125 commercial students (male)        32.2         34.5      21.9       28.7    31.8       30.9
81 salesmanship students (male)       27.1         38.3      24.2       26.8    36.4       25.5
51 students of law (male)             26.0         30.3      28.0       30.7    37.1       27.6
24 students of literature (female)    23.7         27.2      40.2       29.4    29.5       30.1
80 missionaries (both sexes)          27.0         23.0      26.3       35.0    21.8       48.9
26 Boy Scout leaders                  31.6                   21.5       32.3    28.5       33.?

Note: The mean scores for these occupational groups are found to be decidedly significant
when studied in the light of the statistical reliability of the differences.


tests. Figure 49 presents items 21 and 22 of the Thurstone 
Vocational Guidance Test in Physics. 

Many other general as well as specific tests of clerical and
mechanical abilities have been devised, among them the I.K.R.
and Thurstone Clerical Tests, and the MacQuarrie, Minnesota,
and Stenquist tests of mechanical ability.

Certain attitudes important in different professions and vocations
may also be determined through tests. Thurstone has
edited and devised tests * for measuring a wide variety of such
social attitudes.

Various character traits, or more or less habitual adjustment
patterns, such as introversion and extroversion, and ascendance-submission,
may also be detected. The A-S Reaction


* Chicago, University of Chicago Press, 1930 and 1931.



21. What force is necessary at R in Figure 2 to balance the weight? 
Neglect the weight of the lever arms. 

Answer ........ lb.

22. Figure 3 shows a tank of water with two pistons. The force on
the upper piston rod due to water pressure is lb. What will 
be the force due to water pressure on the lower piston rod? 


FIG. 49. ITEMS FROM THURSTONE VOCATIONAL GUIDANCE TEST: PHYSICS

Thurstone, L. L. Vocational Guidance Tests: Physics. Copyright, 1919, by World
Book Company, Yonkers-on-Hudson, New York. Reproduced by written permission.


Study * consists of two forms, one each for men and women.
It is self-administering and requires no timing. The following
illustrative items are taken from the test:

1. Some possession of yours is being worked upon at a repair shop.
You call for it at the time appointed, but the repair man informs
you that he has “only just begun work on it.” Is your customary
reaction

to upbraid him?.

to express dissatisfaction mildly?.

to smother your feelings entirely?.

* Allport, G. W., and Allport, F. H. The Ascendance-Submission Reaction Study. (A
scale for measurement.) Boston, Houghton Mifflin Company, 1928.























2. After a very tiring day you decide to keep your seat in a crowded
street-car even though ladies have to stand. You overhear one
of the ladies refer to the situation in some remark to her companion.
Do you

rise and offer your seat?. 

remain in your seat feeling ill at ease?. 

remain in your seat without embarrassment?. 

3. You are at a mixed party where about half the people are friends 
of yours. The affair becomes very dull, and something should be 
done to enliven it. You have an idea. Do you usually 

take the initiative in carrying it out?. 

pass it on to another to put into execution?. 

say nothing about it?. 

Those whose scores indicate submissiveness would not be
advised to go into fields calling for managerial skill; those who
appear dominant might be so advised if other factors are favorable.
The Manual indicates certain cautions which should be
observed in interpreting and utilizing the obtained results.

QUESTIONS AND EXERCISES 

1. Contrast diagnosis and appraisal practices on the elementary, 
secondary, and college levels. 

2. In what manner may tests of mental maturity be utilized on the 
secondary level? 

3. How does this differ from the use of the psychological examination 
used on the college level? 

4. Indicate reasons why adjustment diagnosis is important on the 
high-school and college levels. 

5. Outline a program for each level, explaining how you would
administer it.

6. Explain what you would do with the cheating problem on the high- 
school and college levels. 

7. Outline the problems and techniques of health and physical 
diagnosis. 

8. What evidence is there of lack of mastery of tools of learning on 
the secondary and college levels? In what manner should such 
shortages be treated? 
9. Explain what use, if any, should be made of such instruments as
the Ophthalm-O-Graph and the Metron-O-Scope on the secondary 
and college levels. 

10. In what manner are prediction and diagnosis related? 

11. Describe one prognosis test and tell how to utilize it. 

12. In what manner should the high school and college utilize vocational
and professional interests and abilities?
CHAPTER TEN


DIAGNOSIS AND APPRAISAL IN 
SECONDARY AND HIGHER 
EDUCATION (Concluded)


While the basic principles of diagnosis and appraisal are similar
on all levels, the context of attitudes and conditions in
which they must function is different. In the past, mental- 
maturity and prognosis tests have gradually found a place on 
the secondary and college levels. More recently teachers have 
shown a distinct interest in tests of personality and adjustment. 
Serious problems have forced attention to health difficulties 
and to the lack of mastery of the basic tools of learning. This 
chapter deals with problems of the subject fields related to the 
improvement of adjustment and learning activities. 

Diagnosis of difficulties in subject fields. The previous
chapter has been concerned with certain factors which have
conditioned teacher-training and teachers’ attitudes; factors
of mental maturity; types of adjustment diagnosis; health
difficulties and physical defects; the basic tools or skills of learning;
prognosis testing and testing for vocational interests and
aptitudes; and certain character traits. There remains the
diagnosis of learning difficulties in the various subject fields in
the light of the above types of information, together with diagnostic
testing in these more specific fields.

Space will not permit the detailed discussion of the many
tests available. As in the case of other appraisal procedures,
they may be found in bibliographies,* indexes, and publishers’
catalogues. The point of departure in all diagnosis,
ideally, is normative information such as is furnished by a
standardized test. Without such information, instructors
may guess, surmise, or believe that certain standards of
achievement have been attained, but they do not know. When
carefully validated tests indicate that progress and attainment
are normal in the light of other data, no special diagnosis is
indicated. When such tests reveal difficulty areas, more specific
diagnosis of learning difficulties should be undertaken.

* See Hildreth, G. H., A Bibliography of Mental Tests and Rating Scales, published by The
Psychological Corporation, New York; Buros, Oscar K., Educational, Psychological, and
Personality Tests, Rutgers University, New Brunswick, N.J.; and similar works.

Unfortunately, few standardized tests have been devised on 
the college level, and many good tests developed for high 
schools are almost unknown among high-school teachers. 
The general nature of a limited number of such tests will be 
considered. 

Diagnosis in reading and English. Among the good high-school
and college reading tests is the Nelson-Denny.* This
test for colleges and senior high schools consists of two forms
of two parts each, Part I a vocabulary test, and Part II a test
of ability to read and comprehend the paragraph.

The test is designed (1) to predict probable success in college;
(2) to section incoming college or high-school classes; (3)
to aid in the diagnosis of student difficulties.

A careful set of directions aids the instructor to administer 
the test. The student responds in an answer booklet equipped 
with the Clapp-Young self-marking device, which eliminates 
the usual drudgery of scoring, and also makes possible the 
repeated use of the test folder. The test is compact, well
organized, and easy to administer. Percentile norms are
available for senior high-school, college, and university 
students. 

* Nelson, M. J., and Denny, E. C. The Nelson-Denny Reading Test. Boston, Houghton
Mifflin Company, 1929.

Writing and interpretation difficulties have largely defied
detection and analysis. The general merit of English composition
has been evaluated by scales of the type devised by
Hudelson,* an excerpt of which is presented:

0.009 (No. 1) A SUCRWBAL BIAHT ON StTERS ILE,

scuball ard the boy sur and the boy the hag at the fors fon 
• v boys the gain soe sunbright. 


2.0 


2.009 (No. 3) A Snow battle on Slatters Hill 

The south end boy said that they could lick the north end 
boys so the North build a fort on Slattye hill. On the after 
noon of wednesdy The South advance with their boys which 
was fifty strong. The first ball was fired by our general Harrison
who jit the South general in the pit of the Stomact and
this started the fight, that night the marched of defeated. 


4.0 


4.000 (No. 8) A Snowball Fight on Slatter’s Hill 

The north enders had built a fort on top of Slatter’s hill 
which covered about one acre. 

The next morning when the south enders found out that 
the north enders had built the fort they started to get the 
army together. 

With about fifty they attacked the fort. Several boys were 
taken prisoners. 

With Jack Harris as leader of the northenders they won. 
after the battle Jack had a bunged eye but he praised his 
men much for the work they had done 


6.0 


5.994 (No. 13) A Snowball Fight on Slatter’s Hill. 

Slatter’s Hill was a barren sort of place, a kind of “No 
Man's Land.” This hill was the imaginary boundary line 
between the North and the South Ends of town. Each End 
had been trying to obtain possession of Slatter’s Hill. 

One evening the North End, under command of Jack Harris,
took possession of Slatter’s Hill. They immediately built
breastworks of ice and snow. The next afternoon the South
End, fifty strong, came up presumably to take the Hill away
from the North Enders. A terrible snowball fight ensued,
during which several people were taken prisoner; but when the
terrific battle ended the North End was still in possession of
Slatter’s Hill. The only casualty was the bruised eye of Jack
Harris. Jack said, “I’m mighty proud of you boys and shall
never forget the fight on Slatter’s Hill.”

* Hudelson, Earl. Typical Composition Ability Scale. Bloomington, Ill., Public School
Publishing Company, 1923.

In this type of product scale, the instructor reads the story 
aloud and allows time for the student to write it in his own 
words. Student compositions are then compared with the 
scale and appropriate values assigned. The work of students 
below standard is then diagnosed more specifically. 

Van Wagenen * evaluates exposition, narration, and description
separately, providing a scale for each. Each type of
composition may be evaluated on content, structure, and mechanics,
giving a more detailed and useful diagnosis. While
these scales have been carefully developed, many teachers have
not been willing to give the time necessary to learn to profit by
them, and generally attempt to evaluate all aspects of composition
by one reading.

Many tests have been devised to evaluate other outcomes
in English, among them the Stanford English and American
Literature Tests, the Barrett-Ryan Literature Test, the various
Iowa tests in literature, and the Logasa-Wright Seven Tests
for the Appreciation of Literature. Larm and Odell have
devised twenty-five comprehensive tests for various readings
in English. Among the Stanford Tests for Junior Colleges,
the tests in English will prove useful.

* Van Wagenen, M. J. English Composition Scales. Yonkers-on-Hudson, N.Y., World
Book Company, 1923.

Commercial subjects. Nearly one-third of all students in
some high schools are taking commercial subjects, among
them shorthand, typewriting, bookkeeping or accounting, and
commercial arithmetic. Errors are easily diagnosed in both
shorthand and typewriting; in fact, some very successful teachers
employ daily timed tests. Others avoid testing in the belief
that it is dangerous or inimical to learning, which is contrary
not only to everyday experience but also to the results of research.
The mere knowledge that tests will be given increases achievement.
Whether or not students will be nervous is primarily a
matter of the atmosphere which the teacher creates and maintains.
Since testing is a part of the teaching cycle, students
accept it as a matter of routine. Furthermore, there is no
way to determine attainment without tests.

The fundamental principle here as in other skill learning is 
the prompt identification and correction of errors before they 
become automatized. Since the emphasis on reading rather 
than writing during the early weeks of the course seems to be 
growing, diagnostic testing will emphasize the reading type 
objectives. By circulating among pupils as they work, many 
of the causes of errors can be detected and corrected while work 
is in progress. However, a periodic test to determine the status 
of these skills is desirable in order to determine the success with 
which the class as a whole as well as each student is progressing. 
Various business houses have definite minima in taking notes, 
transcribing, and typewriting; periodic tests reveal the extent to 
which those who will shortly leave school are nearing these goals. 

Figure 50 presents two samples from the Hoke Measuring
Scale * for shorthand, which consists of sixteen samples with
scores ranging from 0 to 95. As in the case of longhand writing
scales, students compare their own work with the scale and
assign to it the value of the sample which it most nearly resembles.
It is not necessary for learning that scaling be extremely
accurate; approximations are sufficient. The scale
serves not only for evaluation but also as a model. Most textbooks
have only perfect or near perfect samples, and students
have no other criteria for judging their progress.

When students are significantly below standard, a more detailed
diagnosis is indicated as in longhand writing. Among
the diagnostic tests available is that of Rollinson by the same
publishers.

* Hoke, Elmer. Measuring Scale for Gregg Shorthand. New York, Gregg Publishing
Company, 1922.

The determination of rate and of the number of errors constitutes
appraisal in typing and is relatively simple. Since
standards are also set in these terms it is unnecessary to attempt
to combine them into a composite score which has less
meaning. When students are not up to standard, the causes
must be sought among such immediate factors as general position,
hand position, nature of stroke, handling of space bar, shift
keys, carriage return, and back spacer. When these do not reveal
the causes of poor work, vision, hearing, motor co-ordination,
aptitude, capacity, and other factors may be investigated.
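
Keeping rate and errors as two separate measures, rather than one composite score, can be sketched as follows. The five-strokes-per-word convention, the function names, and the standard of 50 words per minute with no more than 5 errors are assumptions made for illustration; none of them comes from the text.

```python
def typing_result(strokes: int, minutes: float, errors: int,
                  strokes_per_word: int = 5) -> tuple[float, int]:
    """Return (gross words per minute, error count) as two separate measures."""
    gross_wpm = strokes / strokes_per_word / minutes
    return gross_wpm, errors


def meets_standard(gross_wpm: float, errors: int,
                   min_wpm: float, max_errors: int) -> bool:
    """Judge rate and accuracy against their own standards, not a composite."""
    return gross_wpm >= min_wpm and errors <= max_errors


# Hypothetical five-minute timed test.
rate, mistakes = typing_result(strokes=1500, minutes=5, errors=4)
print(f"rate: {rate:.0f} words per minute, errors: {mistakes}")
print("up to standard:", meets_standard(rate, mistakes, min_wpm=50, max_errors=5))
```

Reporting the two figures side by side shows at once whether a student who falls short needs work on speed, on accuracy, or on both.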

Many tests in bookkeeping and a few in commercial arithmetic
and commercial law are also available. For the most
part they are designed for appraisal purposes, but a study of
the items missed by individuals and a tabulation of class errors
yield useful diagnostic information. However, the Elwell-Fowlkes
Instructional Tests in Bookkeeping, a series of seventy-eight
tests, may be used for both diagnosis and appraisal
during the first year.
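
A tabulation of class errors of the kind mentioned above can be sketched in a few lines. The student labels and item numbers are hypothetical and do not come from any published bookkeeping test.

```python
from collections import Counter

# Hypothetical record of the item numbers each student missed.
missed_items = {
    "Student A": [3, 7, 12],
    "Student B": [7, 12],
    "Student C": [2, 7],
    "Student D": [7, 12, 15],
}


def tabulate_class_errors(missed: dict[str, list[int]]) -> Counter:
    """Count how many students missed each item."""
    counts = Counter()
    for items in missed.values():
        counts.update(set(items))  # count each item once per student
    return counts


class_errors = tabulate_class_errors(missed_items)
for item, students in class_errors.most_common(2):
    print(f"item {item}: missed by {students} of {len(missed_items)} students")
```

Items missed by most of the class point to a topic that needs reteaching, while the per-student lists show where individual help is needed.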

Mathematics. Figure 51 is a sample from the Webb Geometry
Tests.* It consists of five parts which are scored separately
and which reveal relative success in five areas.

The American Council has sponsored the development of
a large number of excellent tests, among them one in trigonometry.*
Figure 52 consists of sample items from Form A. Like
the Webb Geometry Test, it consists of five parts which are
scored separately, and reveals status in these different areas.

The major problem in the teaching of high-school and college
mathematics is to find real as contrasted with the described
problems of most textbooks. The major objective is not only 
to master the processes, but to gain control of them as tools of 
quantitative thinking. Most tests here, as on the elementary 

* Webb, Paul. Webb Geometry Tests. Bloomington, Ill., Public School Publishing
Company, 1926.

* Orleans, J. B., Raudenbush, H. W., Siceloff, L. P., and Wood, B. D. American Council
Trigonometry Test. Yonkers-on-Hudson, N.Y., World Book Company, 192?.



Directions: Complete the statements given below by filling in the blank 
spaces. In connection with each statement study the figure and the facts 
which are given concerning it. Each statement is an individual exercise 
in itself and has nothing to do with any other statement. 

Given: 

OL a radius perpendicular to 
diameter AB 

KL tangent to circle O at
point L 

CD perpendicular to AB 

Statement: 

1. If angle 1 is 75°, then angle 2 is.

2. Triangle ABC is similar to triangle BDC. Therefore AB and 
.are corresponding sides. 

3. If AD is 8 and CD is 12, then DB is. 

4. If is 9 and BC is 12, then AB is. 

5. If CM is 8 and MJ is 5 and MB is 4, then AM is. 

FIG. 51. SAMPLE ITEM FROM WEBB GEOMETRY TEST 




FIG. 52. SAMPLE ITEMS FROM THE AMERICAN COUNCIL 
TRIGONOMETRY TEST 


level, are confined largely to problems which are built to provide
opportunity to manipulate processes. It is relatively
much more difficult to provide test situations in which the conditions
must be discovered by the student.

With the exception of algebra and trigonometry, there are
practically no standardized mathematics tests on the university
level.

Science tests. Among science tests available on the high-school
level, none is more useful than the Blaisdell * series,
consisting of twenty-five separate diagnostic tests on as many
aspects of biology, totaling some 1250 items. Items have been
carefully selected from leading textbooks and courses of study.
Each test may be used as a pre-test or after the objectives of the
topic have presumably been attained. In addition to the diagnostic
aspect of each test, norms are available for comparative
purposes. Figure 53 presents a number of identification items.


11. Directions: In each pair of parentheses at the right of the page put
the name of the part of the drawing that has the same number as
the parentheses in which you write your answer. (Some of the ... of the drawings.)

[Drawings: a stage in the ... of the bean; a complete leaf; a cross-section of a fleshy root.]

29. ( ) 29
30. ( ) 30
31. ( ) 31
32. ( ) 32
33. ( ) 33
34. ( ) 34
35. ( ) 35
36. ( ) 36
37. ( ) 37
38. ( ) 38
39. ( ) 39
40. ( ) 40

FIG. 53. SAMPLE ITEMS FROM THE BLAISDELL INSTRUCTIONAL TESTS IN BIOLOGY

* From Blaisdell, J. G. Instructional Tests in Biology. Copyright, 1931, by World
Book Company, Yonkers-on-Hudson, N.Y. Reproduced by written permission.

The Ruch-Cossman Biology Test has been prepared for both
high-school and college use. Content has been selected in 
harmony with the best teaching practices. There are two 
forms. The Co-operative, Cooprider, and Oakes-Powers are
among other tests of this type available in biology. 

From the standpoint of diagnostic teaching, as well as appraisal,
the Curtis-Caldwell-Sherman “Tests for Biology for
Today” are outstanding. They consist of eight sub-tests, two
semester sub-tests, and four end-of-the-year sub-tests on structures
and functions, biological principles, and the major generalizations
of scientific attitudes and scientific methods.

The Curtis * tests in general science consist of fourteen 
fifty-item tests which may be used for diagnosis and four longer 
appraisal examinations in addition to one each on scientific 
principles or generalizations, on scientific attitudes, and on 
scientific method. The teacher who uses such tests properly 
and faithfully and follows up with the remedial work indicated 
will have few if any unenthusiastic or unsuccessful students. 
The following are sample items for Unit Nine: 


13. The telegraph sounder completes and breaks the circuit when
messages are being sent. .......... 13

14. Sounds are interpreted by the (a) outer ear; (b) eardrum;
(c) middle ear; (d) inner ear; (e) bones in the ear. .......... 14

15. Every phonograph in order to reproduce the record must have a
(a) steel needle; (b) diaphragm; (c) wooden case; (d) handle for
winding the motor; (e) curving horn; (f) pleasing tone. .......... 15

16. The telegraph key acts in much the same way as the (a) armature
of a dynamo; (b) vibrator of an electric bell; (c) switch; (d) fuse;
(e) poles of an electric bell. .......... 16

17. A motionless body rarely gives off sound waves. .......... 17

18. Write in the blank the numbers of the following words in the
order of speed and ease with which sounds pass through them,
beginning with the one through which sounds pass most readily:
(1) copper bar; (2) partial vacuum; (3) water; (4) a vacuum;
(5) oxygen. .......... 18

¹ Curtis, F. D. Tests; Science for Today. Boston, Ginn and Company, 1936.

The Ruch-Popenoe General Science Test¹ consists of two
parts, one of multiple-choice items on facts, information, and 
understandings, and the other identifications and problems. 
Figure 54 presents one item from Form A, Part II. 

a  The mechanical advantage of this pulley system is .......... a

b  The rule for the mechanical advantage of any pulley system is that
the mechanical advantage is equal to the number of times the cord
passes to and from the .......... pulley. b

c  Disregarding friction, the force needed to lift the 100-pound
weight shown is .......... pounds. c

FIG. 54. SAMPLE ITEM FROM RUCH-POPENOE GENERAL
SCIENCE TEST

¹ Ruch, G. M., and Popenoe, H. F. General Science Test. Copyright, 1923, by World
Book Company, Yonkers-on-Hudson, N.Y. Reproduced by written permission.

Among the outstanding tests in physics and chemistry are
the following: Powers General Chemistry Test (high school);
George Washington University General and Organic Chemistry
Tests (college); Iowa Placement Examinations in Chemistry;
Columbia Research Bureau Physics Test (high school
and college); Co-operative Physics Test (high school and college);
and the Michigan Instructional Tests in Physics. Over
a hundred tests are available in the various sciences, most of
them of the appraisal rather than the diagnostic type and
almost entirely for use on the high-school level.

Social studies. More than fifty tests in geography, fifty in 
general social studies and civics, and one hundred in history 
are available, most of them of the appraisal type on the high- 
school level. In geography, the Buckingham-Stevenson 
United States, the Posey-Van Wagenen Series, and the Witham 
Tests have been widely used. The American Council Civics 
and Government Test is prepared for both high-school and 
college use and contains an excellent sampling of the most 
important objectives in this field. The following items are 
taken from the Denny-Nelson¹ American History Tests,
Form B, which have been used with satisfaction in the lower 
years of the secondary school: 

C. Directions: Each President’s name below is given a number.
Show by writing the correct number in the parentheses the
administration in which each event occurred.

Jefferson (1)    Jackson (2)    Lincoln (3)    Grant (4)    Roosevelt (5)

46. Firing on Fort Sumter. (  )
47. Settlement of the Alabama Claims. (  )
48. Pure Food Law. (  )
49. Fulton’s success with the steamboat. (  )
50. The Aaron Burr conspiracy. (  )
51. Nullification movement in South Carolina. (  )
52. The United States begins work on the Panama Canal. (  )
53. Completion of first transcontinental railroad. (  )
54. The Centennial Exhibition at Philadelphia. (  )
55. First wide use of the “spoils system.” (  )


¹ Denny, E. C., and Nelson, M. J. American History Test. Copyright, 1928, by World
Book Company, Yonkers-on-Hudson, New York. Reproduced by written permission.

The Ely-King¹ Tests in American History are also intended
for the junior high school. Each of the seven parts of the test 
may be used for diagnostic purposes, and the test as a whole 
for appraisal at the end of the course. The following are 
sample items: 

I. On the left, below, is a list of causes; on the right is a list of results.
In the parentheses preceding each cause place the letter of its result.

Causes:

1. (  ) Alien and Sedition laws
2. (  ) Impressment of American seamen
3. (  ) Opposition of Southern states to assumption of state debts by
        national government
4. (  ) X. Y. Z. Affair
5. (  ) Purchase of Louisiana
6. (  ) Exploration of Gray
7. (  ) Embargo Act
8. (  ) Opposition of New England toward War of 1812

Results:

(a) Informal war with France
(b) Hard times in United States
(c) Lewis and Clark Expedition
(d) Doctrine of Nullification
(e) Hartford Convention
(f) War of 1812
(g) Claim of United States to Oregon
(h) Location of new United States capital

The Interpretation Tests¹ in American History by the same
authors are prepared in two equivalent forms and test-content 
includes civic attitudes, appreciations, and interpretations, 
with a minimum of purely factual items. 

The Brown-Woody Civics Test² consists of three parts:
civic vocabulary, civic information, and civic thinking. The
following item is illustrative of the content of Part III, Form B:

Directions: Imagine that you are a voter. There are two candidates
seeking the office of mayor of your city. The qualities and characteristics
of each are listed below. We shall call one Mr. A and the
other Mr. B. You are to select the candidate who is, in your opinion,
best fitted to perform the duties of that office. Read carefully the
characteristics and qualities of each before you make your decision.
Place an X in the square following the name of the candidate for
whom you would vote.

Begin here.

Mr. A □

Legally qualified for office
Previously member of city council
College graduate
Deliberate in judgment
Fearless in politics
Church supporter
Fair in business
Endorsed by merchants
Member of Board of Education
Possesses many firm friends
Insists on law enforcement
Works well with others
Respected citizen
Understands city needs
Cashier of local bank
“Booster for city”

Mr. B □

Legally qualified for office
Formerly city postmaster
University graduate
Quick-witted
Supported by politicians
Church member
“Guardian of city treasury”
Likes popular favor
Member of Patriotic League
Boasts many acquaintances
Successful corporation lawyer
Well-known sportsman
Kind to his family
Member of important fraternal orders
Manager of local industry
Big taxpayer

After you have selected one of the above candidates, place a check
mark (√) before the qualities or characteristics that determined your
choice.

¹ Ely, L. A., and King, E. Tests in American History and Interpretation Tests in American
History. Los Angeles, Calif., California Test Bureau, 1929.

² Brown, A. W., and Woody, Clifford. Civics Test. Copyright, 1926, by World Book
Company, Yonkers-on-Hudson, New York. Reproduced by written permission.

This field offers some of the most difficult problems in diagnosis
and appraisal. The conflicting political philosophies
and economic interests in the community, as well as among 
groups which develop, revise, or at least influence curriculum 
making, cause a constant tendency to shift from fundamental 
facts, principles, and problems to movements and institutions 
with argument and bias substituted for facts and evidence. 
Sudden political upheavals and social and economic results 
conspire to upset established viewpoints and require constant 
modification of interpretation of world events. Pressure
groups are constantly attempting to propagandize their particular
philosophies.

There are difficulties also in more specific aspects of the field. 
The development of character and personality, usually described
in terms of a series of more or less presumably discrete
concepts, is dependent to a degree upon the uncertain background
described above. Such traits as honesty, punctuality,
and co-operation, furthermore, are not the simple concepts in
which most laymen as well as teachers believe, but complex,
even though more or less persistent, adjustment patterns. The
same individual often exhibits rather wide differences of 
ethical conduct in different situations. Characteristic behav¬ 
ior patterns are influenced by factors other than abstract prin¬ 
ciples and ethical ideals. 

The major learning difficulty in certain areas of the social- 
studies field, however, is not measurement at all, but the fact 
that teachers and experts do not know specifically enough 
just what their objectives are. If they can agree upon a few 
fundamental specific aims, tests can easily be developed to 
appraise these outcomes. As long as they insist upon thinking 
in terms of such vague generalities that they cannot understand 
each other, social-studies tests cannot be very satisfactory. A 
good test is a sample of important and relatively specific 
objectives. As long as social scientists believe that their
inability to describe a particular trait mathematically, or,
better, to identify the quantitative aspects of adjustment patterns,
settles the issue for all time and relieves them of further
responsibility, the situation will continue. Inability to describe
a trait or adjustment pattern, either in terms of its
nature, antecedents, or consequents, however, is evidence of 
confusion and ignorance. If a reality which corresponds to the 
concept of a particular trait actually exists, evidences of it can 
be found. If no such evidence exists, the reality is in doubt, 
and discussion must remain academic in nature.

Thus, instead of believing in punctuality as a general vague
mysterious quality developed in an even more mysterious 
manner, we analyze it into specific instances of such behavior as 
arriving on time, handing in work on time, and doing other 
things on time. We find that to conform to reality and make 
the definition meaningful, we are forced to describe it mathe¬ 
matically, or in terms of the relative number of satisfactory and 
unsatisfactory acts in relation to what individuals on the av¬ 
erage do under similar circumstances. 
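
The definition of punctuality just given can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the form of the record (counts of acts done on time and acts done late), and the sample figures are hypothetical and do not come from any published scale.

```python
def punctuality_rate(on_time: int, late: int) -> float:
    """Share of recorded acts that were done on time."""
    total = on_time + late
    if total == 0:
        raise ValueError("no recorded acts to rate")
    return on_time / total


def relative_punctuality(pupil: tuple[int, int],
                         group: list[tuple[int, int]]) -> float:
    """Pupil's on-time rate divided by the group's average on-time rate.

    A value above 1 means the pupil is on time more often than the
    average individual in the comparison group; below 1, less often.
    """
    group_average = sum(punctuality_rate(o, l) for o, l in group) / len(group)
    return punctuality_rate(*pupil) / group_average


# Hypothetical records: (acts on time, acts late) over the same period.
classmates = [(16, 4), (18, 2), (14, 6)]
index = relative_punctuality((18, 2), classmates)
print(f"relative punctuality: {index:.2f}")
```

The ratio form is one choice among several; a difference from the group average would serve equally well, provided the comparison is made "under similar circumstances" as the text requires.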

Language diagnosis. Among the objectives which have 
been stressed in classical and modern foreign languages are 
the following: 

1. Ability to read for pleasure 

2. Knowledge of grammar and constructions 

3. Ability to write the language 

4. Attainment of culture through a knowledge of peoples and their 
institutions 

5. Ability to speak the language and use it as a tool of intercourse 

6. Ability to use the language in business and the professions 

7. As a tool for learning better English 

8. As a tool for research 

All language diagnosis (and appraisal) should be in terms of 
the objectives established. One of the obstacles to learning 
has been the large number of new words and constructions 
which students have been expected to master in relatively short 
periods of time; hence the importance of diagnostic vocabulary 
and construction tests, particularly the former. Reading 
tests may consist of simple translations set up in form con¬ 
venient for diagnosis. 

Whereas the “pony” English translation of selections read 
was once prohibited, there is some evidence to indicate that 
reading ability is increased faster with such an aid than with 
the exclusive use of the dictionary. Ability to speak must be 
diagnosed and appraised in individual tests, and most of this
activity must necessarily proceed as part of regular class exercises.
Where records are made for pronunciation, enunciation,
or accent, instructors may diagnose at their leisure.
Individual oral examinations of the type of the Gray Oral 
Check Tests in Reading are also helpful in diagnosing oral 
language difficulties. 

One of the most difficult problems on the college level is 
diagnosis and appraisal of the abilities of candidates for ad¬ 
vanced degrees. A “reading knowledge” of a foreign lan¬ 
guage is a very elastic requirement. How much ability is 
needed for using various foreign languages as tools of research 
could be determined much more definitely by vocabulary and 
construction studies of the materials available for research, and 
diagnosis and appraisal of the candidate’s language abilities 
made in the light of these objectives. For the most part, 
present appraisal and diagnosis is superficial and subjective, 
and candidates consider this requirement a mere hurdle. One 
of the immediate necessities is research on the language abili¬ 
ties needed to utilize available materials in the various fields. 

Over a hundred tests are available in French, German, 
Italian, Spanish, Chinese, and Latin. They consist mainly of 
prognosis and achievement or appraisal tests. Since growth 
in language ability is confined to a relatively few variables, 
appraisal tests have definite value as points of departure in 
diagnosis. Errors and difficulties are relatively easy to 
identify. 

Among the tests which have been widely used are the fol¬ 
lowing: American Council, Columbia Research, and Co-oper¬ 
ative Tests in French, German, and Spanish; the Broome- 
Brown Test in French; and the Contreras-Broome-Kaulfers 
Spanish Test. 

Diagnosis and measurement in other fields. The use of 
tests and other diagnostic and appraisal instruments is growing 
in other fields. In schools of nursing, for example, intelligence
tests, achievement tests, tests of mechanical aptitude, and
adjustment inventories are used as selective devices. Regular
diagnostic and appraisal tests are being developed to discover 
why automobile drivers have accidents. Many such drivers 
must now take performance as well as paper-and-pencil tests, in 
addition to the regular vision and hearing tests, for licenses to 
drive. Journalism is developing achievement scales, and score 
books for evaluating the merit of newspapers, as well as tests 
for other outcomes in this field. The conviction that adequate 
diagnostic and appraisal testing is essential to guiding learning 
and adjustment in any field is gradually forcing its way into 
the consciousness of thoughtful teachers everywhere. 

Admission to and graduation from college. For the most
part, high-school graduates are still admitted to college on the 
basis of a high-school diploma or transcript. Some colleges 
require credit in certain definite fields; others require certain 
definite recommended units. When the student cannot qual¬ 
ify on either of these bases he is usually allowed to take an 
entrance examination of the college board type. The prin¬ 
cipal’s recommendation is also frequently required. 

The wide differences of ability among those who present the
same credentials constitute a difficult guidance problem on the
college and university level. Whereas we may argue the
theory that the interests of society and rights of taxpayers 
demand that students be kept in high schools or even admitted 
to public universities to profit as they can, graduation from 
such institutions of higher learning should, but does not at 
present, guarantee any particular degree of competence. For 
the most part it indicates length of attendance, and courses 
taken regardless of the marks assigned. 

Responsibility for improvement of this situation must be 
shared by both institutions. The high school must send stu¬ 
dents to college better prepared, and the college must greatly 
improve its techniques and services to students, but it must
also refuse more and more resolutely as time goes on to put a
false stamp of approval on those who have not attained the 
objectives of the curricula selected. 

Diagnostic and appraisal testing will not alone solve the
problem on either level, but it will require a much more
definite determination of objectives, which is at present greatly
needed. It will make possible a frequent checkup in terms
of these objectives, and it will enable instructors to obtain 
meaningful data, to be used with other evidence, to determine 
graduation. 

A mark in a subject at present means relatively little because 
of the unreliability of marks both in high school and college. 
For example, instead of admitting to college for more advanced 
courses students with A’s or B’s in stenography and type¬ 
writing, and arguing the question of whether two periods of 
such work count as one period of English, entrance require¬ 
ments may be stated in terms of rate of writing and transcrib¬ 
ing shorthand and rate of typewriting with a specified number 
of errors. In a similar manner, objectives and evidences of 
attainment in other fields should be made objective, definite, 
and valid. Tests for diagnosis and appraisal should not be 
used spasmodically, half-heartedly, and amateurishly, but 
should be developed as a regular part of the teaching-learning 
routine. Evidence of competence in this aspect of teaching 
ability should be required as definitely as evidence of subject- 
matter mastery, before instructors are certified and employed. 

QUESTIONS AND EXERCISES 

1. Outline the general method of diagnosing learning difficulties in the 
major fields. 

2. Explain the diagnosis of the causes of reading difficulties. 

3. What would you do about reading difficulties on the secondary level? 

4. What should be done with the college student who cannot read
adequately?

5. What factors make diagnosis and appraisal in the commercial fields
simpler than in the social studies?

6. Explain the techniques used in mathematics; science; languages. 

7. Discuss the problems related to college admission and indicate how 
better methods of diagnosis and appraisal may aid in their solution.


CHAPTER ELEVEN


ABILITY GROUPING AND 
CLASSIFICATION 

The terms “ability grouping” and “classification” have been
used more or less synonymously to describe the attempt to 
divide heterogeneous pupil groups into homogeneous groups. 
Establishing special classes for subnormals, dull normals, and 
supernormals; attempting to divide large pupil groups into 
classes of more nearly equal ability; or attempting to classify 
pupils of so-called normal classes into several more homogeneous
groups, are common examples. The fundamental claim of
those who have favored ability grouping has been that pupils
would learn more, or faster, as members of homogeneous
groups than as members of heterogeneous groups. One, if not
the principal, factor in most classification schemes has been 
intelligence (M.A. or I.Q., or both). All plans for grouping 
were defended on the basis of argument rather than evidence. 
More recently, chronological age, social maturation, pure 
heterogeneity, and other factors have been advocated. 

Arguments favoring ability grouping. An extensive liter¬ 
ature has been produced in favor of ability grouping. Many 
arguments like the following have been added or amplified from 
the original claim that greater or faster learning results from 
such grouping: 

1. Children work better in homogeneous groups.

2. Children are more social-minded in homogeneous groups. 

3. Children accomplish more in homogeneous groups than when 
they are unclassified. 

4. Grouping increases the enthusiasm of teachers, who therefore 
work more effectively.

5. Discipline problems disappear under homogeneous grouping.

6. Pupils invariably adjust better in homogeneous groups. 

7. Superior students profit from the keener competition of homo¬ 
geneous groups. 

8. Moral deterioration follows heterogeneous grouping since so 
many will not work up to their best possibilities. 

9. Inferior students are frustrated and defeated in heterogeneous 
groups. 

10. Homogeneous grouping reduces retardation. 

11. Ability grouping is inevitable outside of school; grouping in 
school is simply taking advantage of a natural law. 

12. Teaching is easier with homogeneous groups. 

13. Children are happier in homogeneous groups. 

14. Instruction can be adjusted to pupil ability. 

15. More pupils succeed at more things. 

16. If heterogeneous grouping is carried to its logical conclusion, 
idiots and geniuses must be taught in the same class. 

Arguments against ability grouping. Among the arguments 
against grouping are the following: 

1. Grouping is undemocratic; it will lead to deterioration of demo¬ 
cratic ideals and institutions. 

2. Grouping results in the development of undesirable personality 
traits, such as inferiorities and superiorities. 

3. A co-operative class spirit cannot be developed when children 
are divided in this manner. 

4. There is no inspiration for dull pupils. 

5. Bright children overwork. 

6. Grouping is fatalism and determinism; it assumes that individ¬ 
uals do not change. 

7. Undesirable competition and selfishness are fostered in homo¬ 
geneous groups. 

8. Instruction is not adapted to groups of varying ability. 

9. Groups cannot be kept homogeneous because children are con¬ 
stantly changing. 

10. Grouping is not a substitute for teaching. 

11. Grouping as carried on is largely arbitrary and much harm is 
done. 

12. Parents are against ability grouping.

13. Homogeneous grouping aids in the perpetuation of the traditional
emphasis on static abstractions.

14. Grouping impedes social progress. 

15. Children most in need of help are branded as educational failures. 

Studies in ability grouping. Many studies of ability group¬ 
ing were attempted during the argumentative period, but most 
of them were naive and erroneous and produced no real evi¬ 
dence in favor of either side. In many cases, the basis of 
classification was not definite; in others, the two groups (classi¬ 
fied and not classified) were treated differently, but results 
obtained were related directly and only to the fact of classification;
in still other cases, the same individuals who favored a particular
point of view not only grouped pupils but also marked
them at the end of the studies and then offered these marks as
evidence of the superiority of their viewpoints. Many studies
used no control groups. Often it was impossible to determine
how final relative achievement had been measured.
Finally, some studies used a combination of criteria for grouping
which involved personal judgment to such an extent that had 
a method of homogeneous grouping, or one that produced a 
maximum achievement, been discovered, no one could have 
duplicated the feat. 

Miller and Otto¹ investigated a large number of studies and
reported that, basically, all were concerned with achievement; 
other possible advantages of grouping were ignored. Among 
their results were the following: 

1. Grades included: Ten of the studies are confined to grades six to 
twelve; seven largely to the elementary grades; two to college 
work; and in one the grade is uncertain. 

2. Bases for grouping: Various combinations of mental age, achieve¬ 
ment scores, teachers' marks, and the like have been used, many 
of the studies not relying upon a single criterion for classification. 

¹ Miller, W. S., and Otto, Henry. “Analysis of Experimental Studies in Homogeneous
Grouping”; in Journal of Educational Research, vol. 21, pp. 95-102. (February, 1930.)

Rock¹ made a very comprehensive study and reported no
statistically or educationally significant differences in the 
achievement of pupils of equal ability in heterogeneous and 
homogeneous groups. His criticisms of studies in general are 
as follows: 

1. Lack of provision for differentiation of teaching methods or cur¬ 
ricula. 

2. Experiments as a rule were too short. Habits built up during 
four or eight years cannot be changed quickly. 

3. Certain gains made in homogeneous groups cannot be measured 
through the ordinary subject-matter tests. Teachers are gener¬ 
ally favorable to ability grouping, hence some interpretations of 
data are probably influenced by this factor. 

Individual studies in ability grouping. Among the more 
carefully done studies in ability grouping are those of Clark,²
Nettels, Kefauver, Purdom, Breed, Keliher, and Souvain. 
Clark reclassified pupils of four elementary schools, with aver¬ 
age I.Q.’s below 100, on the basis of reading ability primarily; 
arithmetic, language, and the like received secondary attention. 
The achievement of all four schools was raised considerably 
above average. Teachers’ judgments were obtained as to 
whether or not this method of classification was superior to, the 
same as, or inferior to the regular counselor procedure. A sum¬ 
mary of these judgments is presented in Table 25. 

Table 25. Teachers’ Reactions to Clark’s Method of Classification
(cells shown as ... are illegible in the source)

Items                                                         Per Cent  Per Cent  Per Cent
                                                              Superior  Same      Inferior

 1. Opportunity for fitting instruction to pupil needs          90.0       8.3       1.7
 2. Use of test results for improving instruction                ...       ...       0.0
 3. Attitude of pupils toward school work                       88.3       ...       4.0
 4. Homogeneity of reading group                                85.0       ...       3.6
 5. Homogeneity of social maturity                              83.3       ...       9.0
 6. Opportunity for remedial activities                         83.3      14.7       2.0
 7. Opportunity for evaluating the effectiveness of teaching    80.0      14.0       6.0
 8. Homogeneity of mental maturity                              80.0      14.8       5.2
 9. Homogeneity of physical maturity                            68.3      22.1       9.6
10. Opportunity for enriched activities                         60.0      23.0       ...
11. Simplification of discipline problems                       40.0      48.5       ...
12. Homogeneity of arithmetic group                             38.3      26.4       ...
13. Discussion of pupils’ accomplishment with parents           33.3      65.1       ...
14. Homogeneity of groups in special subjects                   28.3      44.5      27.2

¹ Rock, Robert R., Jr. “A Critical Study of Current Practices in Ability Grouping”; in
Educational Research Bulletin of the Catholic University of America, vol. 4, nos. 5 and 6.
(May and June, 1929.)

² Clark, Willis W. “An Experiment in Grading and Classifying Four Elementary
Schools”; in Los Angeles Educational Research Bulletin, vol. 7, no. 8. (May, 1928.)

³ Nettels, C. H. “Classifying Pupils in 7B Mathematics”; in Los Angeles Educational
Research Bulletin, vol. 7, no. 8. (May, 1928.)

Nettels³ studied the problem of classifying pupils in mathematics
and reached the following conclusions:

The best single factor in predicting success in mathematics
was the Los Angeles Test in the Fundamentals of Arithmetic.
The correlation between arithmetic scores and success in algebra
was .80. The correlation between the Los Angeles Reasoning
Test and success was .64, and between I.Q. and success
.48.

A combination of the scores on the reasoning test and I.Q.’s 
with the scores on fundamentals raised the correlation to .86. 
This combination was made by the use of the regression equation.
The scores on the reasoning test were multiplied by
three, and those on the fundamentals test by two. With these
results, pupils may be ranked in groups for instructional pur¬ 
poses in mathematics, with the expectancy of success indicated 
by the above coefficient. 
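
The weighting and ranking just described can be sketched as follows. This is a minimal illustration, not Nettels's own computation: the weights of three and two are those stated above, but the weight given to the I.Q. is not stated in the text and is taken here as 1 by assumption; the function names and the sample scores are hypothetical.

```python
from math import sqrt


def composite_score(reasoning: float, fundamentals: float, iq: float,
                    iq_weight: float = 1.0) -> float:
    """Weighted sum: reasoning x 3, fundamentals x 2 (from the text).

    iq_weight is an assumption; the source does not give the I.Q. weight.
    """
    return 3 * reasoning + 2 * fundamentals + iq_weight * iq


def pearson_r(xs: list[float], ys: list[float]) -> float:
    """Product-moment correlation between two equally long lists."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def rank_into_groups(scores: dict[str, float], n_groups: int) -> list[list[str]]:
    """Rank pupils by composite, highest first, and cut into n_groups."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    size, extra = divmod(len(ranked), n_groups)
    groups, start = [], 0
    for g in range(n_groups):
        end = start + size + (1 if g < extra else 0)
        groups.append(ranked[start:end])
        start = end
    return groups


# Hypothetical pupils: (reasoning score, fundamentals score, I.Q.).
pupils = {"A": (12, 30, 105), "B": (8, 22, 98), "C": (15, 35, 112), "D": (10, 25, 101)}
composites = {name: composite_score(*marks) for name, marks in pupils.items()}
print(rank_into_groups(composites, 2))
```

With later marks in mathematics in hand, `pearson_r` applied to the composites and those marks would give a coefficient of the kind quoted above.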

Kefauver¹ investigated the prediction of the junior-high-school
success of 110 sixth-graders. He used nine bases of
prediction: 

1. Mean of teachers’ marks (in fifth and sixth grades). 

2. Teachers’ estimates of capacity. 

3. Teachers’ estimates of application. 

4. Multi-Mental test score. 

5. Multi-Mental I.Q. 

6. Thorndike-McCall Reading T-score. 

7. Woody-McCall mixed fundamentals. 

8. Monroe Reasoning test scores. 

9. Composite of the above eight. 

His conclusions were, briefly, as follows: 

1. The most important single source of information for predicting 
success in the first year of junior or high school was the judgment 
of the teachers in the elementary school. 

2. Although the “composite” of the factors listed above ranked sec¬ 
ond in predicting high-school success, the second best “single 
source” of information was the I.Q. 

3. Next in importance were “teachers’ marks” and mental-test score
or “mental age.”

4. When two factors were used, the best combination was that of 
teachers’ estimates and intelligence quotients; the next best, 
teachers’ estimates and mental age. 

Purdom,² using the experimental and control-group technique
and standardized achievement tests, reached the following
conclusions:

¹ Kefauver, Grayson N. Teachers College Record, pp. 99-113, vol. 31, no. 2. (November,
1929.)

² Purdom, L. T. Value of Homogeneous Grouping. Baltimore, Warwick and York, 1929.

1. Pupils in homogeneous sections do not gain more than pupils in
heterogeneous sections when the results are measured by standard¬ 
ized tests. 

2. Pupils in the homogeneous sections make lower semester grades 
in English, but higher grades in algebra. 

3. Pupils in the homogeneous sections do not cover more course 
material. 

4. The semester grades do not show that the pupils in the homogene¬ 
ous sections put forth greater effort. 

5. The gains made on the standardized tests and the semester grades 
do not show that the pupils of any degree of intelligence were 
favored by homogeneous grouping. 

6. Homogeneous grouping on the basis of the intelligence test does 
not reduce failures. 

7. The semester grades given by the teachers do not agree with the 
opinions expressed by the teachers, even though the semester 
grades were made on the basis of the daily classwork, monthly 
tests, and final examinations. 

Keliher¹ criticized all previous studies because they conceived
and recognized individual differences only in terms of
subject-matter, and attempted modifications of the school pro¬ 
gram only in terms of scholastic factors. She emphasized the 
importance of recognizing the “whole-child” concept in any 
form of grouping. Her conclusions were as follows: 

1. Homogeneous grouping is not in accord with progressive theory. 

2. It is not in harmony with recent research in biology, psychology, 
physiology, and mental hygiene. 

3. It does not accomplish its avowed purposes. 

4. Its further use in the elementary schools seems unjustified. 

Breed² studied five methods of classification and compared
the results. He used the following methods of grouping: 

1. Achievement.

2. Intelligence.

3. Ratio between achievement and intelligence.

4. Average between achievement and intelligence.

5. Achievement supplemented by intelligence.

¹ Keliher, Alice V. A Critical Study of Homogeneous Grouping with a Critique of Measurement
as the Basis of Classification. Teachers College Contributions to Education, no. 453.
New York, Columbia University, 1931.

² Breed, Frederick S. Classroom Organization and Management, pp. 115-16. Yonkers-on-Hudson,
New York, World Book Company, 1933.

He concluded that the third method was least desirable and the 
fifth most accurate, although the last two methods gave identi¬ 
cal results. 

Souvain * attacked the problem from another angle, obtaining
the opinions of professional and non-professional groups.
These responses indicated sharply conflicting opinions ranging 
from highly favorable to definitely unfavorable reactions to 
grouping. Among the factors which are related to responses 
about grouping are the following: 

1. Teachers and parents of children in “bright” sections are more 
favorable to grouping than are others. 

2. Where the I.Q. is weighted too heavily in grouping, teachers
question the effect on the resulting development of social
attitudes.

3. Teachers’ statements of their educational philosophies and their 
responses to the study appear not to be closely related. 

4. Teachers in schools where attempts have been made to differentiate
curricula are more in favor of grouping than are others.

5. Attempts to differentiate curricula in harmony with ability 
groups have not received the favorable opinion of parents. 

6. Teachers in “low-class” communities feel surer of the value of 
grouping, but are also more inclined to avoid teaching the slow 
groups. 

7. The sex of the parent answering the questionnaire appears to have 
no particular relation to the type of answer. 

8. Teachers in lower grades are more favorable to grouping than
are teachers in upper grades.

9. Teachers who have always attempted to group pupils like grouping
better than those who have taught without grouping.

10. Few teachers tell their pupils in what ability groups they are 
classified. 

* Souvain, Walter H. A Study of the Opinions of Certain Professional and Non-Professional
Groups Regarding Homogeneous or Ability Grouping. Teachers College Contributions
to Education, no. 596. New York, Columbia University, 1934.

To what extent is homogeneity possible? Failure to detect
any significant differences in achievement between presumably
homogeneous and admittedly heterogeneous groups long ago
led many students to question the significance of the differences
in homogeneity which was attained to begin with in various
studies. The development of the California Tests of Mental
Maturity * made it possible to break down large numbers of
M.A.’s and I.Q.’s with comparative ease and to confirm this
point of view. The following data are typical of the results of
studies which have been conducted.

Homogeneity of pupils with identical I.Q.’s. In the past,
classification ranges of eight to ten points in terms of I.Q. have
been considered satisfactory. When this range could be reduced
to five points, workers felt that they had attained a high
degree of homogeneity. Further analysis of pupils of identical
I.Q.’s, however, reveals that they are still far from homogeneous.
In general, the larger the group the greater the heterogeneity.
For this reason, small groups are used in the following tables so
that the range of individual differences may not be exaggerated.
Table 26 presents data on twenty-five fifth-grade pupils all of
whom had obtained I.Q.’s of 106 on the California Tests of
Mental Maturity, Elementary Battery. This test is organized
to yield separate measures of memory ability, spacial or visualization
ability, mathematical and logical reasoning, verbal ability,
and language and non-language ability, all of which
appear to be more or less independent factors in mental maturity,
and all of which have in the past been hidden in a single
or total M.A. or I.Q. which obscured their nature and varying
manifestations.

Table 26, therefore, reveals how heterogeneous pupils of 
identical I.Q.’s may really prove to be upon adequate analysis, 
and explains why the correlation between I.Q. and achievement 
is not higher. 

* These are group tests, which are used with large numbers of cases. Published by California
Test Bureau, Los Angeles.

Table 26. The Heterogeneity of Twenty-Five Pupils with
Identical I.Q.’s of 106

Factors                                 Ranges in Months

1. Memory ability                       120 months M.A. to 180 months M.A., or 60 months of M.A.
2. Spacial or visualization ability     101 months M.A. to 180 months M.A., or 79 months of M.A.
3. Reasoning ability                    130 months M.A. to 159 months M.A., or 29 months of M.A.
4. Verbal ability                        98 months M.A. to 156 months M.A., or 58 months of M.A.
5. Language ability                     120 months M.A. to 149 months M.A., or 29 months of M.A.
6. Non-language ability                 121 months M.A. to 156 months M.A., or 35 months of M.A.
7. Chronological age                    121 months C.A. to 139 months C.A., or 18 months of C.A.


Table 26 is interpreted as follows: Twenty-five fifth-grade
pupils of 106 I.Q. vary in memory ability from 120 months
mental age (the average of those who are 120 months or 10 years
old chronologically) to 180 months mental age (average of those
who are 180 months or 15 years old chronologically), or a range
of 5 years. In spacial or visualization ability they vary 79
months, or 6 years and 7 months, and so forth. Incidentally,
this group proved to vary only 18 months, or 1½ years,
chronologically. Thus when I.Q.’s are broken down into the factors
which together produce them, it is clear that identical I.Q.’s do
not mean identical ability or abilities.

Homogeneity of pupils with identical M.A.’s. The I.Q. for 
each pupil is, of course, obtained up to a minimum of fifteen 
years by dividing his M.A. by his C.A. In other words, there 
is obtained a total M.A. from the test. Until recently this 
total M.A. could not be broken down, and yielded a false appearance
of homogeneity among pupils of identical or nearly
identical M.A.’s. These total M.A.’s for each student may now
be broken down into the M.A.’s for the various factors as presented
in Table 27.
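
The ratio described above can be sketched in a few lines. This is only an illustration: the function names are hypothetical, the multiplication by 100 is the conventional scaling (the text states only that M.A. is divided by C.A.), and the age limit mentioned in the text is ignored.

```python
def ratio_iq(mental_age_months, chronological_age_months):
    """Ratio I.Q.: mental age divided by chronological age, scaled by 100."""
    if chronological_age_months <= 0:
        raise ValueError("chronological age must be positive")
    return 100.0 * mental_age_months / chronological_age_months


def implied_mental_age(iq, chronological_age_months):
    """Total M.A. in months implied by a given I.Q. and C.A."""
    return iq * chronological_age_months / 100.0


# Table 26 reports C.A.'s from 121 to 139 months for pupils of I.Q. 106,
# so even their total M.A.'s cannot all be the same.
youngest_total_ma = implied_mental_age(106, 121)
oldest_total_ma = implied_mental_age(106, 139)
```

Under these assumptions the two pupils at the extremes of chronological age differ in total M.A. by well over a year, although their I.Q.'s are identical.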












Table 27. Heterogeneity of Twenty-Five Pupils with
Identical M.A.’s of 138 Months

Factors                                 Ranges

1. Memory ability                       102 months M.A. to 176 months M.A., or 74 months of M.A.
2. Spacial or visualization ability      99 months M.A. to 182 months M.A., or 83 months of M.A.
3. Reasoning ability                    123 months M.A. to 152 months M.A., or 29 months of M.A.
4. Vocabulary or verbal ability         100 months M.A. to 156 months M.A., or 56 months of M.A.
5. Language ability                     128 months M.A. to 151 months M.A., or 23 months of M.A.
6. Non-language ability                 122 months M.A. to 151 months M.A., or 29 months of M.A.
7. Chronological age                    120 months C.A. to 154 months C.A., or 34 months of C.A.

Table 27 is interpreted as follows: Pupils who obtain the 
same total M.A.’s on intelligence tests vary considerably in the 
factors which together produce the total M.A. In memory 
ability this variation extends from an M.A. of 102 months, or 
from the average of those who are 102 months old chronologically,
to an M.A. of 176 months, or to the average of those who
are 176 months old chronologically, a total range of 74 months, 
or 6 years and 2 months. The variation among pupils with 
identical total M.A.’s of 138 months ranges from 99 months to 
182 months, or 83 months (6 years and 11 months) in spacial or 
visualization ability. The range in chronological age proved to 
be 34 months, or 2 years and 10 months. 
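
The arithmetic behind these interpretations is simple enough to check mechanically. The following sketch uses hypothetical helper names and takes the low and high figures from Table 27; it converts a range in months into years and months.

```python
def years_and_months(total_months):
    """Split a span given in months into whole years and remaining months."""
    return divmod(total_months, 12)


def describe_range(low_months, high_months):
    """Describe the spread between the lowest and highest age in a group."""
    span = high_months - low_months
    years, months = years_and_months(span)
    return f"{span} months ({years} years and {months} months)"


# Low and high values copied from Table 27.
memory_range = describe_range(102, 176)
spacial_range = describe_range(99, 182)
chronological_range = describe_range(120, 154)
```

The same helper applies without change to the ranges reported in the other tables of this chapter.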

Homogeneity of pupils of the same chronological age. If we
study pupils of the same chronological age, we are again impressed
by their inevitable heterogeneity. Table 28 presents
ranges in M.A. in six major factors and total range in I.Q.’s.

Table 28. Heterogeneity of Twenty-Five Pupils, All of
129 Months C.A.

Factors                                      Ranges

1. Memory ability                            121 months M.A. to 180 months M.A., or 59 months of M.A.
2. Spacial relations or visualization ability 82 months M.A. to 196 months M.A., or 114 months of M.A.
3. Reasoning ability                         114 months M.A. to 172 months M.A., or 58 months of M.A.
4. Vocabulary or verbal ability              115 months M.A. to 174 months M.A., or 59 months of M.A.
5. Language factors                          126 months M.A. to 168 months M.A., or 42 months of M.A.
6. Non-language factors                      109 months M.A. to 188 months M.A., or 79 months of M.A.
7. I.Q.’s                                    92 I.Q. to 133 I.Q., or 41 I.Q. points

Table 28 is interpreted as follows: Pupils of the same chronological
age vary considerably in ability. Twenty-five pupils
each 129 months old chronologically ranged in memory ability
from a mental age of 121 months, or the average of those 121
months old chronologically, to an M.A. of 180 months, or to
the average of those who are 180 months old chronologically, or
a total spread of 59 months or nearly five years. The variation
in I.Q. for this group of identical C.A. proved to be 41 points.

When each of the six factors utilized above (memory ability,
spacial relations, reasoning, verbal ability, language ability, and
non-language ability) is further analyzed, other differences
are found. Thus it appears that one reason we obtain
no significant differences in achievement between presumably
homogeneous and heterogeneous groups is the fact that we
never achieved any significant degree of homogeneity.

Grouping on the basis of social maturation is now being
widely advocated. Obviously, such a plan has certain theoretical
advantages. From the standpoint of achieving homogeneity,
however, the difficulties of grouping on this basis are
greater, if anything, than of grouping on the basis of somewhat
more objective M.A.’s and I.Q.’s, or on the basis of chronological
age. However, the idea has merit and will receive further
attention.

Range of maturity in a grouped and an ungrouped class.
Figures 55 and 56 represent, respectively, an analysis of the range
of manifestation of certain traits and abilities of a typical ungrouped
sixth grade and of a grouped sixth grade with an I.Q. range of
103 to 107. An inspection of these figures reveals
an apparent striking similarity between the two groups, which
must not, however, be interpreted too literally. These figures
indicate the range of manifestation of the two groups with
respect to specific tests and factors, but do not reveal the central
tendencies or variabilities of these traits and abilities.
Nevertheless, the similarity is sufficient to raise a serious question
concerning the extent to which homogeneity is achieved by
grouping according to I.Q., particularly when the average of
the I.Q.’s of the groups concerned is not far from 100.

Other factors related to homogeneity. Careful scientific 
tests again and again reveal accurately a fact which has escaped 
most teachers and is only vaguely appreciated by others; 
namely, that children and youth are growing, changing beings, 
and that even if it were possible to achieve absolute homogeneity 
in a single factor, this condition would persist for only a brief 
period. 

Children do not mature mentally, physically, emotionally, or 
socially at the same rate. Neither does a single individual 
maintain a steady rate of growth; health, motivation, accidents,
and changes of environment are among those factors
which may cause variations. Again, the average rate of mental 
maturation slows down as the chronological age of sixteen years 
is approached. Many have slowed down sooner and stopped 
completely, so far as can be determined, while others have gone 
on maturing for many years. 

The reliability of the tests themselves on which grouping has
been based is another factor. For example, the scores of some
of the most reliable intelligence tests have probable errors of
about five I.Q. points. An individual who obtains an I.Q. of
100 on one test has an even chance of having his true I.Q. located
between 95 and 105. The chances are also even that his
true I.Q. is outside these limits; in fact, it may occasionally be
as low as 80 or as high as 120. The reliability of I.Q.’s must
therefore be kept in mind whenever their meaning is interpreted.
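
The meaning of a probable error of five points can be made concrete with a short calculation. The sketch below assumes that errors of measurement are normally distributed, which the text does not state; the function name is hypothetical. Under that assumption a probable error equals 0.6745 standard deviations.

```python
import math

# For a normal distribution, half of all errors fall within
# 0.6745 standard deviations of zero; that distance is the probable error.
PE_PER_SD = 0.6745


def chance_within(limit_points, probable_error):
    """Chance that the true score lies within +/- limit_points of the
    obtained score, assuming normally distributed errors (an assumption)."""
    sd = probable_error / PE_PER_SD
    return math.erf(limit_points / (sd * math.sqrt(2)))


within_5 = chance_within(5, 5)     # the "even chance" of 95 to 105
within_20 = chance_within(20, 5)   # chance of lying between 80 and 120
outside_20 = 1.0 - within_20       # the occasional extreme case
```

On these assumptions the chance of a true I.Q. outside 80 to 120 is small but not zero, which agrees with the statement that such cases occur occasionally.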

Another important factor which has been a source of error in
grouping studies has been the regression effect from single applications
of a test, or the tendency for some of those who receive
the highest and lowest scores on a test to receive lower and
higher scores respectively on repetitions of the same test or on
applications of similar tests.
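
The regression effect can be illustrated with a small simulation. Everything numerical here is an assumption chosen for illustration (the number of pupils, the spread of true I.Q.'s, and the size of the error), and the variable names are hypothetical; no data from the studies cited are used.

```python
import random
import statistics

random.seed(0)  # fixed seed so the illustration is repeatable

N_PUPILS = 2000
TRUE_SD = 15.0   # assumed spread of true I.Q.'s
ERROR_SD = 7.4   # assumed error spread, roughly a probable error of 5 points

true_iq = [random.gauss(100, TRUE_SD) for _ in range(N_PUPILS)]
first_test = [t + random.gauss(0, ERROR_SD) for t in true_iq]
second_test = [t + random.gauss(0, ERROR_SD) for t in true_iq]

# Form "high" and "low" sections from the FIRST test only.
order = sorted(range(N_PUPILS), key=lambda i: first_test[i])
tenth = N_PUPILS // 10
low_group, high_group = order[:tenth], order[-tenth:]


def group_mean(scores, group):
    """Mean score of the pupils whose indices are listed in group."""
    return statistics.mean(scores[i] for i in group)


high_first = group_mean(first_test, high_group)
high_second = group_mean(second_test, high_group)
low_first = group_mean(first_test, low_group)
low_second = group_mean(second_test, low_group)
```

In this sketch the high section averages lower on the second test and the low section averages higher, although no pupil's true ability has changed. A grouping study that selects sections on a single test application and then retests them can mistake this movement for an effect of grouping.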

Group heterogeneity and stability of central tendencies. In 
spite of the differences between individual pupils of the same 
I.Q.’s, M.A.’s, or C.A.’s, and the effects of growth, regression, 
and other factors, measurements of central tendencies of ability 
and achievement (means or medians) of relatively heterogeneous
groups are relatively stable. That is, the average of the
obtained measurements of I.Q.’s for a group as a whole does not 
suffer from the same disabilities as does the measurement of the 
I.Q. of a single individual. Similarly, the average of a group 
of individual measurements of achievement is relatively much 
more stable than a measurement of achievement of a single 
individual. Therefore, while homogeneity for a group is most 
difficult if not impossible, measures of central tendencies are 
possible and useful. It is for this reason that group achievement
may be predicted from measures of group ability, and
valid norms and standards may be established in terms of pupil 
needs and capacities. These standards are especially useful in 
building and revising curricula, selecting materials, and in other 
administrative and supervisory guidance. 
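
One way to see why group averages are steadier than individual scores is the usual rule that the error of a mean shrinks with the square root of the number of cases. The sketch below assumes independent errors of measurement for each pupil and uses a hypothetical function name; the probable error of five points is the figure quoted earlier for an individual I.Q.

```python
import math


def probable_error_of_mean(pe_individual, n_pupils):
    """Probable error of a group mean, assuming each pupil's error of
    measurement is independent of the others (an assumption)."""
    if n_pupils < 1:
        raise ValueError("a group needs at least one pupil")
    return pe_individual / math.sqrt(n_pupils)


# An individual I.Q. with a probable error of 5 points, compared with
# the mean I.Q. of classes of 25 and of 100 pupils.
pe_single = probable_error_of_mean(5, 1)
pe_class_of_25 = probable_error_of_mean(5, 25)
pe_group_of_100 = probable_error_of_mean(5, 100)
```

Under this assumption the mean of a class of twenty-five is subject to one fifth of the measurement error that attaches to any single pupil's score.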

Group heterogeneity and individual achievement. While
central tendencies are relatively stable in spite of the heterogeneity
of presumably homogeneous groups, measurements of
growth and development of pupils of identical I.Q.’s or M.A.’s
are bound to vary greatly because of wide individual differences
not revealed by these identical I.Q.’s and M.A.’s. For example,
memory plays an important part in most types of learning.
When memory maturity of a group of pupils, all with I.Q.’s of
107, varies as much as eighty months, it is plain why individual
achievement of pupils of the same I.Q. may vary from failure
to superior. Some pupils with high I.Q.’s fail in certain activities
and succeed in others; some pupils with relatively low
I.Q.’s succeed in certain activities in which those with higher
I.Q.’s fail. Total M.A.’s or I.Q.’s are too crude to afford a
prediction of individual achievement or furnish individual
guidance. The apparently inevitable heterogeneity of allegedly
homogeneous groups forces us to break down these
general measures into the factors of which they are constituted
if they are to provide any significant assistance in the
objective for which ability grouping was conceived and intended.

[Figure: pupil profile chart. Columns: Test Factor; Possible Score; Pupil’s Score; Mental Age (months and years). Rows: A. Memory (4. Immediate Recall, 5. Delayed Recall); B. Spacial Relationships (6. Sensing Right and Left, 7. Manipulation of Areas, 8. Foresight in Spacial Situations); C. Reasoning (9. Opposites, 10. Similarities, 11. Analogies, 12. Number Series, 13. Numerical Quantity, 14. Numerical Quantity, 15. Inference); D. 16. Vocabulary.]

Practical difficulties in grouping. There are practical difficulties
which would at present seriously limit the use of homogeneous
grouping, even though we discovered how to achieve it.
First of all, there are many small schools in which only single
classes in each grade exist. In some schools teachers have two
or three grades. In many rural schools teachers have pupils in
grades ranging from the first to the sixth or eighth.

The age of entrance is another problem. Even when a definite
chronological age, for example, six, is set, the entering
group is not in reality exactly or nearly six years old, but frequently
ranges from five years and six months or younger to
seven or eight. Where mental age is utilized, it is usually supplementary,
and transition groups are formed, while those who
are ready take up the regular first-grade activities. However, 
we have seen that while mental age provides a more or less 
crucial standard for beginning certain activities, and adds to 
our knowledge of the child’s capacities, pupils with identical 
mental ages still differ in many ways.

Administrative regulations, based on tradition, expediency,
and good common sense, have decreed that pupils should advance
in rather regular fashion through the grades. Such a
plan limits the possibility of classifying certain pupils with the
groups they most nearly resemble on the basis of past techniques.
Yet there has been no lack of clever and ingenious
devices for circumventing the intent of such regulations. However,
the fact that the number of seats available in the next room
has often been the deciding factor in promotion has gone far to
wreck any plan of classification, no matter how fantastic.

Another factor which is coming into the picture is the no-failure
program. Under this plan, the burden of proof is on the
teacher if the pupil does not pass. In small schools, this plan
often keeps groups intact and ignores real or imaginary differences
which occur as the result of varying rates of maturation
and learning.

Finally, the emphasis on units of work, particularly in the
primary grades, and, in later grades, in the social studies, sciences,
and industrial arts, usually discourages grouping in certain
activities. To group would, in fact, destroy some of the
theoretical values which such activities possess.

Thus to the practical impossibility of attaining homogeneity, 
the difficulties which arise from varying rates of maturation, 
unreliabilities of tests, regression effects in testing, and administrative
difficulty, we add another factor, the question of the
desirability of grouping for certain activities. 

The arguments presented at the beginning of the chapter and 
constantly repeated even today are now seen to be largely 
beside the point. We have never had homogeneous grouping, 
hence we have never experienced its alleged benefits nor suffered 
from its pernicious effects. Children have always learned in 
heterogeneous groups. All who read these lines were members 
of such groups. However, a more comprehensive understanding
of the nature of such factors as ability, social maturation,
homogeneity, and heterogeneity will aid us in avoiding many
of the time-consuming and meaningless activities which we
have utilized in the past in attempting to achieve the apparently
impossible.

Relative homogeneity and learning. Inability to attain the 
absolute or nearly absolute homogeneity which we have visioned
in the past does not prevent us from enjoying certain
advantages of relative homogeneity. We have long had a rough 
type of ability grouping in our classes or grades; in spite of past 
misunderstanding of what constitutes a class or grade, such an 
administrative device is legitimate and useful. 

Pupils with I.Q.’s above 115 differ from one another in many 
ways, but it may still be profitable to segregate them, provided 
we understand the probable error of the I.Q. and its relation to 
grouping, and provided we learn to adapt learning activities to 
such groups. One of our difficult problems has been the fact 
that learning is directed, if at all, in terms of the interests and 
capacities of teachers rather than of pupils, which is a natural 
result of the direction which teachers have received in teacher-training
schools.

On the other extreme, pupils of very limited capacity, in 
spite of their differences, may well be grouped together. And 
we have been forced in this area better to adapt learning to the 
nature, needs, interests, and capacities of these pupils since 
some of their characteristics are so definite and resistant. 

The ungraded room has met a real need in many schools. It 
is a conscious acknowledgment, on a wider basis, of what has 
been unconsciously denied in most classrooms, namely, that 
every child is an individual personality and has certain specific 
interests, capacities, and needs. Grouping pupils into classes 
or smaller groups does not relieve us of the responsibility of discovering
these individual interests, capacities, and needs. Certain
limitations or defects are legitimate bases of classification,
quite independently of interests, needs, or capacities. Thus we
have special sight-saving classes and classes for the blind, classes
for the deaf and the hard of hearing, and classes for cripples,
as well as for those with other defects.

Nor is it necessary to group pupils only on these bases. 
Where there are many pupils in the same grade or subject 
broken up into smaller groups for instructional purposes, it is 
not only legitimate but desirable to group them on the basis of 
size if fixed seats in various classrooms vary in this manner. 
Since most grouping has little effect on homogeneity, mere 
teacher or administrative convenience is a legitimate reason for 
grouping, provided, of course, that other more important values 
are not sacrificed. 

In three preceding chapters, extended consideration has been 
given to the problem of adjustment diagnosis. Social maturation,
as used in this discussion, is an important aspect of
adjustment, the latter including more specifically intellectual, 
physical, and emotional problems as well as social maturation. 

Although perhaps the most intangible basis of grouping, 
social maturation is theoretically the most important. As far 
as possible, children and youth should be associated with those 
with whom they can best work, play, and live. Learning 
should be an interesting, challenging, joyous adventure. There 
is a general conviction that sane, well-balanced, and effective 
personalities can best be achieved in groups where social maturation
is most nearly identical. For this reason, so far as
practical conditions and limitations will permit, pupils who are 
unhappy, or are failing to adjust in spite of the aid teachers 
may extend, should be moved into more congenial groups. 

Grouping on this basis is not so simple as grouping on the 
bases previously presented. There are no objective tests of 
social maturation. In fact, the concept itself is frequently 
questioned since what is designated as maturation may, in 
reality, be more a matter of personality or social adjustment 
than maturation. Some pupils are more unhappy and less
effective when moved to presumably more congenial groups
even though they have failed to adjust properly in the first
instance. Many pupils cannot be moved to more congenial
groups because no such groups exist. Careful and systematic
observation, sympathetic conferences with children and youth,
and consideration of other evidences of adjustment and success
must guide the teacher.

Intra-class grouping. Teachers have long utilized grouping 
within classes. In music, pupils with the best ears and voices 
are seated in the rear seats, and frequently assist other pupils; 
the less apt students profit by the good voices behind them.* 
Where no transition reading groups are established, teachers 
frequently have two or even three different groups within the 
same class, often giving more time to more immature groups. 
Errors in spelling and arithmetic are easily identified; those 
who achieve all objectives for the day frequently engage in a 
new type of activity or do more advanced work, while the 
teacher works with the group which has had difficulties. In 
this case, the personnel of the groups changes from day to day. 
Classes are frequently divided in a social-studies unit, smaller 
groups engaging in different activities for a part of the time. 

In intra-class grouping, the teacher does not as a rule concern 
herself with measures of capacity, but with immediate learning 
and adjustment. Relative success with the objectives or purposes
for the day, topic, or unit is the basis of grouping. Temporary
reorganization or grouping of the class which will result
in the most effective learning is the objective.

When several grades are represented in a single class, there 
is usually less need and less opportunity for this type of grouping.
Instead, the process is reversed, and pupils of all grades
may engage in certain activities as a group. 

To serve its purpose, all intra-class grouping should be done 
on reliable evidence. Often the most accurate and quickest 
method of determining relative success is a short diagnostic test.

Diagnosis vs. grouping. The more we investigate the problem
of grouping, the more convinced we become of the necessity
for constant systematic diagnosis of the interests, successes,
and needs of every student. No matter how homogeneous a
group may appear, differences which still exist are much more
significant for learning than the extent of homogeneity attained.
Because of these differences, grouping must always be tentative,
must be based on the best evidences available, and should
be revised as soon as evidence demands.

Discovery of individual difficulties and needs does not mean 
individual diagnosis. In fact, most individual needs may be 
discovered through group techniques. When our objectives 
and purposes are of the adjustment type, or when they are 
somewhat vague or uncertain, systematic individual observation
is necessary. Yet most observation may proceed without
disturbing the regular class activities. What is necessary is 
adequate diagnostic record sheets for recording observations to 
use as the basis of future activities. 

Thus through measurement we have discovered and may
eliminate some of the contradictions, fictions, and useless arguments
concerning ability grouping. Clearer aims, purposes,
and objectives, and more general and constant use of objective
evaluation of daily successes and needs of pupils as a guide to
learning activities, will provide in a surer manner the advantages
previously ascribed to grouping.

QUESTIONS AND EXERCISES 

1. Explain several types of ability grouping and classification. 

2. Compare the arguments for and against grouping. Which of them
do you believe to be valid? Invalid?

3. What major conclusions may be drawn from the studies in ability
grouping cited in this chapter?

4. Indicate the possibilities of obtaining homogeneous groups on the
basis of I.Q., M.A., and C.A. What criticisms or modifications,
if any, would you make of the interpretation of the studies
presented?

5. Discuss difficulties and problems in grouping on the basis of social
maturation.

6. What factors interfere with most plans of grouping? Have you any
suggestion for eliminating the effects of these factors?

7. Discuss the relative stability of averages or central tendencies in
heterogeneous groups.

8. Explain wide variations in achievement of pupils of identical M.A.’s
or I.Q.’s.

9. Describe practical administrative difficulties related to grouping,
and indicate how you would solve them.

10. Would you utilize grouping in spite of the apparent impossibility of 
obtaining significantly homogeneous groups? Why? 

11. Discuss the relation of the objectives of diagnosis and grouping. 

12. Work out a plan for grouping from grades one to twelve.

CHAPTER TWELVE


CURRICULUM-BUILDING AND REVISION


The function of measurement in curriculum-building is not to 
mechanize, standardize, or supplant the judgments of teachers, 
administrators, or curriculum experts, nor to dictate any philosophy
of learning; its purpose is to provide information as
exact and specific as possible to serve as the basis for judgments, 
decisions, policies, and procedures, regardless of the philosophy 
or the learning activities which are accepted and utilized. 
Measurement takes the curriculum-builder at his own word, 
and then provides definite information which may be used 
together with other types of evidence in curriculum-building.

The need for curriculum-building and revision. In general,
curriculum changes should be based on the desire to attain
new, different, or added objectives, or evidence of unsatisfactory
attainment of objectives already adopted. Important
obstacles to adequate curriculum-building at present include a
lack of definite objectives, a confusion of objectives with procedures
and activities, and an evaluation of objectives, attained
and unattained, which is inadequate in the light of the announced
intentions of the curriculum. Most present-day discussion
on curriculum-building, and most actual curriculum-building,
proceed largely on theoretical and philosophical bases
which have little or no relation to the facts of pupil growth.
Such facts are not ascertained even though many can be determined
with relative ease and accuracy. Activities in many
communities constitute little more than curriculum-tinkering,
based on the personal ideas of the curriculum expert in charge
and modified little if any by evidence of outcomes or conflicting
ideas of other experts.

While the nature and scope of objectives is largely a matter 
of philosophy, and their attainment on different levels a matter 
of the interests, nature, needs, and capacities of the learners, 
evidence of outcomes should constitute the most important 
guidance for curriculum changes, since it is for the specific purpose
of producing desirable outcomes that objectives are
adopted, teachers and administrators employed, and buildings 
and equipment provided. Thus evidence of attainment or lack 
of attainment of objectives will provide the basis for selection 
or changes in textbooks and other materials, for changes in 
courses of study to give less emphasis to certain objectives and 
more to others, and for changes in procedures. 

The problem of objectives. We know little specifically about 
the objectives which we discuss and incorporate in courses of 
study as generalizations, ideals, and attitudes. We repeat old 
or new catchwords as representative of our purposes and then 
build curricula on other bases, because we differ so much on 
their specific implications. 

Among problems and viewpoints which confuse the issues 
and delay progress are the following: 

1. Conscious or unconscious conflict between the rights and privileges
of the individual as contrasted with his responsibilities to society.
2. The assumed nature of education. School is life, hence no bonds 
in the form of objectives should limit living. 

3. The problem of change. Since we cannot predict conditions fifteen
or twenty years hence, there is no chance to provide
objectives.

4. Individual differences. Children and youth differ greatly, hence 
no set of objectives can be adequate. 

5. Organismic nature of child. The child is a whole; to break down 
experiences into objectives is to do violence to the nature of the 
individual. 

6. Personality and character. To develop a natural, effective personality
the child must be allowed to develop unhampered by
undue restrictions and limitations, and not made to fit a mold.

7. Nature of objectives themselves. Objectives cannot be predetermined
because to a large extent they constitute desirable
activities and adjustments that occur during the process of 
learning. 

8. Confusion in types of objectives. All education occurs through 
doing; fact-and-skill teaching is bad. We should seek to make 
changes in conduct. 

Before considering these problems and viewpoints, we may
well ask what relation, if any, they have to measurement. Can
objective evaluation aid in the clarification of any of the issues 
involved? The answer must be in the affirmative in spite of 
the fact that no perfect test exists, and that evidence of the 
existence of some alleged outcomes is not sufficient to make 
tests possible. This situation is not peculiar to education. 
Medicine also has good tests for certain purposes, less satisfactory
tests for other purposes, and no tests at all in still other
instances. All measurement must be in terms of objectives,
and, in general, the more specific and well-defined the objective
and its evidences, the more accurate and useful its measurement.

The first problem is primarily philosophical. Some extreme 
thinkers insist that individual growth and development is 
everything, and that the claims of society are either relatively 
minor or will be realized more or less automatically if full freedom
for individual growth is allowed. Since schools are established
and maintained to preserve democratic ideals and institutions,
it would appear that objectives must consider the just
claims of the State, one of which is that students learn to live in
society as it is and as it may be in the future. This involves certain
definite knowledges, skills, and habits, and the ability to work
effectively and to earn a living. Those who support schools believe
that students who profit from these advantages should be able
and willing to make some practical contribution to the present
welfare of society, as well as to be obsessed with a knowledge of
their personal rights and privileges and the defects of present 
society. Measurement can determine the extent to which such 
outcomes have been attained, as well as aid in determining the 
extent to which they are attainable. 

Normal and effective living involves setting up goals of 
achievement, and planning to meet just responsibilities. Lack 
of definite objectives or planning can hardly contribute to this 
end. 

The problem of change is not a new one. Its significance is 
a matter of degree. The fact that conditions two decades 
hence cannot be predicted with accuracy should not lead to a 
neglect of obvious immediate objectives. Change should not 
imply dropping old objectives and constantly substituting new 
ones; through all change certain basic values persist. Although 
political, social, and economic conditions may change rapidly, 
human nature does not. Teaching-learning procedures, therefore,
need not be radically changed because the conditions of
life change, but should change because of our increasing knowledge
of the child, his interests, needs, and capacities.

The factor of individual differences was revealed through 
measurement, and the solution of related problems depends 
upon the same techniques. Both learning ability and achievement,
or lack of them, are revealed by testing. The effects of
these wide differences are met by proper grade-placement of 
objectives, by adaptations in procedures which recognize these 
differences, and by proper placement of learners in the school 
organization. Objectives on different levels should be predetermined,
partially on the basis of an adequate knowledge of
children and youth, and partially on the nature of their responsibilities,
present and future, to society. Additional adjustments
to individual differences come through modification of
procedures.

The concept of the so-called organismic nature of the child is
one which has been held by many over a long period of time,
and concerns the essential unity of the individual. This concept
is a useful one, but leads to absurdities when carried to the
extreme. The assumption that because the learner is an organic
unit his learning experiences cannot be broken down into
subjects, fields, units, or topics, but must stop at some particular
point of specificity, is clearly untenable on the basis of
present evidence; in fact it is the very organismic nature of the
child that makes it possible for him to accommodate to wide 
ranges of specificity. It is, however, primarily false logic to 
say that organization of material and processes must parallel 
the assumed nature of the totality of learning potentialities of 
the individual. The organismic psychologist says, for example, 
that it is bad to break down processes of reading into sweeps, 
regressions, eye-span, and the like. No one ever has done 
this to the learner, and these are merely specific aspects of a 
larger whole of which the learner may not be aware. But 
measurements of these individual objectives may reveal the 
desirable specific characteristics of good readers, and show 
where the difficulties of poor readers may lie. Furthermore, 
evidence seems to prove rather conclusively that singling out 
such a specific as eye-movement in reading, and training on 
this specific, may result in a large increase in total reading 
ability. 

The assumed conflict between the development of character
and personality and the limitations of “molds” introduced by
the specificity of objectives is largely illusory. If character and
personality can be modified by educational activities, such modification
should gain rather than lose by definiteness and specificity.
Vagueness and generality of ideas do not lead to
definite outcomes, but do often reveal the confusion of the
curriculum-builder. Again, specificity is often confused with
permanence. That objectives have become definite does
not mean that they are to remain fixed; it is, in fact, easier to
revise or replace a specific objective than one which no one
quite understands. Finally, character and personality are not
the results of activities related to a special group of objectives
separate from other objectives, but the result of learning activities
related to the whole school program in all its differentiation.
Courtesy, co-operation, punctuality, and the like do
not develop in a vacuum, but in connection with reading, dramatics,
and physical education. The expert in character and
personality education is simply asked to be specific so that
measurement may aid him in discovering the presence or absence
of alleged outcomes. Psychologists and psychiatrists do
not hold the view that specific tasks, even undesirable at times,
are dangerous to the growth and development of personality.
Quite the contrary, the willingness and ability to attack and
persist at difficult tasks is a test of normal personality.

Objectives are not limited to specific facts or information on
the one hand or to vague generalizations on the other hand.
Neither should the curriculum-builder attempt to deal with all
objectives, but only with the important or crucial specifics
related to general aims and purposes. Mere activity itself may
be an objective provided it serves some useful purpose. In the
same manner, relaxation or cessation of all activity, or certain
definite and specific adjustments, may serve as objectives. In
fact, from the standpoint of measurement no objective, however
fantastic or far-fetched, would need to be excluded. The
only requirement which measurement makes of the champion
of any objective is that he be specific in regard to its nature and
the evidence of its existence. Measurement can then reveal the
extent to which such evidence is absent or present, but the
burden of proof is on the champion of alleged outcomes until
he becomes specific and definite.

There is constant confusion regarding the nature and relative
values of different types of objectives. Only a few can be considered.
The teacher who regards “fact-and-skill” teaching
as bad may be so vague in her direction of learning that little
actual learning occurs. Indefiniteness does not guarantee 
“higher” outcomes. Furthermore, pupils are noticeably weak 
in certain facts and skills which are essential to normal growth 
and progress. Again, certain teachers believe that learning by 
doing requires constant physical activity. The terminology is 
indefinite and unfortunate because it specifies neither what 
learning nor what doing. Activity primarily mental in nature 
is doing as truly as physical activity, and may be much more 
significant. What is usually indicated is that the activity 
should have useful, understandable, purposeful objectives 
which partake of the nature of reality, and that students should 
attack it in terms of the normal activities of adults under the 
same circumstances. Finally, there are those who insist that 
all education should make changes in conduct. In reality, it is 
impossible to learn without making some changes in conduct. 

The difficulty again arises from the vagueness of the term
“conduct.” Learning to read and to use numbers changes conduct;
learning to attack problems also changes it. If conduct
indicates the ability to make desirable adjustments or to
achieve, the facts still remain the same; certain relatively simple
objectives are vital to all significant conduct. The greatest
need, to avoid confusion, is specificity of objectives and terminology.
Measurement may then supply evidences of alleged
outcomes and the selection and organization of objectives may
proceed with some regard for the facts of a given situation. It
is helpful to know that objectives include facts, information, 
appreciations and understandings, attitudes and ideals, and 
skills, habits, and desirable behavior. But to be significant 
for curriculum-building these categories must be analyzed into 
the specific manifestations of each that should become the goals 
of educational activity. 

Organizing the course of study. Ideally, the course of study 
should be an organization of objectives, rather than of materials 
as it is at present. Materials and procedures may be indicated
or suggested, but should be entirely dependent on objectives 
set up. Such a course may consist entirely of a number of 
related units in certain fields where the major aims or purposes 
justify it, or a combination of more or less comprehensive 
units with the addition of some lesser topics. A whole subject 
might be organized in this manner, with each unit or larger 
division consisting of a major aim or aims and related important 
specific objectives. Some subjects as such must be retained. 
Such an organization of objectives would make possible an 
intelligent selection of materials, more adequate direction of 
learning, and surer diagnosis and appraisal. While essential in 
the selection or production of materials and equipment, such an 
organization would in no way dictate procedures on any level. 
The elementary teacher might still be traditional and attempt 
to attain each objective by direct attack; she might utilize 
major aims and purposes and enriched pupil experience, but 
guide pupil activities in the light of these important specific 
objectives; or she might put her faith in the miracle-working 
powers of a planless curriculum, accept objectives, ignore them, 
and never know where she was nor how she arrived. The secondary
and college teacher could still teach subject-matter
facts, or important outcomes including accurate knowledges
and information. To the extent that objectives are definite
and specific, good tests are possible and measurement can reveal
their attainment or lack of it, and serve as the basis of modifications.

The planless curriculum with its mania for avoiding definiteness
has almost run its course. Those who have defended it in
the effort to avoid a fixed or unchanging curriculum have confused
fixity with definiteness. A definite course of study in
terms of objectives may be changed as often as desirable; but,
while in force, its effectiveness may be appraised through observation
and testing for outcomes, and modifications may be
based on actual needs. Definiteness of objectives in a course
of study does not dictate specific subject-matter, such as a particular
textbook, or supplementary materials; it merely makes
selection of subject-matter more adequate by its definiteness.
Neither does definiteness imply completeness, because no course
of study can possibly contain all desirable objectives. The
course of study should be an organization of all important
objectives so far as they are known, together with suggested
materials and activities known or believed to be effective; objectives
may be added, dropped, modified, or placed on different
levels when desirable, and materials and activities should be
added, dropped, or modified according to the effectiveness
revealed by careful recorded observation and testing. The
planned curriculum merely asks that the teacher have as definite
an idea as possible of what she expects to do in the light of
the assumed desirable characteristics of a good citizen, and of
available knowledge of children and youth at succeeding periods
of development. The peculiarities of the individual normal
student may require modifications in procedures and activities, 
but not in desirable objectives to any significant degree or in 
any significant numbers. 

The functionality of the curriculum is the extent to which it 
aids students in attaining the objectives of that curriculum. 
Functionality cannot be determined from mere inspection or 
argument, or even agreement; it can be determined only 
through careful, systematic recorded observation in terms of 
criteria or standards which represent evidences of desirable 
objectives, and through careful testing of attained objectives. 
The degree of functionality of a curriculum depends not only 
upon the course of study, but upon materials, activities, and 
procedures as well. 

Functionality, therefore, is not guaranteed by any one of the
above factors, or through others such as the avoiding-breaks-in-pupil-experience
compulsion, or the pupil-felt-need complex.
Breaks in significant experience can be avoided more surely
through planning than through drifting along on the uncertain,
whimsical, and often accidental nature of the planless or opportunistic
curriculum. Carefully planned objectives, helpfully
chosen materials, and procedures psychologically correct still 
allow much freedom for teachers who object to an established 
and more or less validated curriculum. 

The felt needs of pupils are important considerations in curriculum-making,
but they have been productive of much confusion.
Felt needs should not be regarded as momentary
whims or sudden urges; these are mere impulses, uncertain 
and diverse, and not safe guides to any plan. Felt needs are 
those which through observation and testing over a period of 
time emerge as more or less constant. This principle does not 
imply, as it has so often been interpreted, that pupils can feel 
or discover all of their needs; felt needs of pupils constitute a 
guide to a part of the curriculum, but “unfelt” needs which
only the experiences and researches of adults can supply constitute
a larger part of the desirable objectives for children and
youth. The attainment by pupils of the characteristics of good 
citizens, for example, does not occur automatically because the 
teacher uses the unit or activity or any other plan, even though 
some plans appear better than others. The teacher should 
know these characteristics, and guide learning activities toward 
these objectives. Furthermore, felt needs, even the most genuine,
are not fundamental like the desire for food or sleep. They
are of many degrees of intensity, and a wise teacher plans many
activities and situations which transform many unfelt objectives
into felt needs. Other teachers, just as wise apparently,
motivate the attainment of objectives equally well by
explanations and the identification of pupil interests with the
objectives sought. Pupil needs are of many types, including
certain physical, emotional, and educational objectives which
can be determined only by testing and measuring and by reliable
plans of systematic observation. Teachers can guess and have
guessed, but the variability of their results proves the futility
of this method. Future courses of study must be based upon
accumulating knowledge of children and youth in general, and
increasing specificity in the objectives we wish them to attain.
The direction of learning activities in the light of this knowledge
and these objectives must be improved by improving the ability
of teachers to test, measure, and otherwise determine
quickly and reliably the personal idiosyncrasies and characteristics
of individual pupils in order better to adjust means to
ends.

Subjects still must remain in the course of study. They are 
not “watertight compartments of knowledge,” but important 
organizations of knowledge. Not subjects and subject-matter, 
but their misuse, is so undesirable. Certain skill subjects, such 
as reading, writing, and arithmetic, will probably remain because
they are useful in attaining certain types of outcomes, no
matter how we may disguise them. Other subjects, such as
history and geography, may merge entirely into social-studies
units which utilize their content to attain other types of outcomes
without necessarily sacrificing related functional knowledge
and information. Subjects are very useful in connection
with even the unit and activity plan, since they enable learners 
to locate many types of information in the shortest time. Real 
units may exist within subject-matter fields; the fact that the 
attainment of important general aims or purposes involves 
crossing subject-matter lines is not of itself important. This 
has always been true to a certain extent in the “topics” and 
special reports of the traditional school. The important consideration
is the fact that if a proper organization of objectives
exists, units, subjects, and more or less independent topics may 
all be utilized to advantage; and measurement will reveal the 
effectiveness of each of them in attaining such objectives. 

Thus, no matter whether a subject, unit, or topic curriculum,
or any combination of these, is utilized, objectives should be
definite enough to be used as the basis of observation and testing.
When certain objectives are not being attained, measurement
will reveal it. A study of the situation should attempt to
determine whether these objectives are too difficult, whether
materials are inadequate, or whether procedures should be
modified. If course-of-study changes are made, measurement
will later reveal the extent to which such changes have improved
the situation, and whether or not additional changes are
desirable. 

Selection and elimination of materials. Theoretically, materials
should be selected in relation to the objectives of the
curriculum as expressed in the course of study, such objectives 
presumably having been selected and organized with due 
regard for the nature of children and youth at different periods 
of development. The characteristics which we wish the 
learner to possess at succeeding points in development should 
also be considered. Practically, materials have been selected 
in a far different manner. 

Most schools and school systems have had some type of 
course of study; among them are mere designations of materials-to-be-covered,
vague statements of objectives with some
materials-to-be-covered, rather full statements of more or less 
disjointed fragmentary objectives, and attempts to organize 
objectives by the armchair, committee, or “expert” method. 
In the selection of textbooks and other materials, however, 
such courses usually were largely forgotten; instead, the school 
committee, the textbook committee of the board of education, 
the superintendent, supervisors and principals, and finally 
teachers successively and in various combinations took responsibility.
Even the first attempts to use score cards in improving
selection were largely independent of the stated objectives,
if any, of the curriculum. 

Improved methods were welcomed by publishers since they
were under the necessity of caring for a wide variety of local
viewpoints with a relatively limited number of publications. 
The first attempts broke down the general problem into such 
factors as the following: ¹

1. Topical content. 

2. Illustrations. 

3. Presentations, discussions, and methods. 

4. Drills, tests, summaries, and reviews. 

5. Vocabulary. 

6. Size of numbers. 

7. Mechanical make-up. 

Different available textbooks were studied in relation to 
these topics, and much detailed and relatively more specific 
information was obtained than by the subjective general- 
impressions methods previously used. The percentage devoted 
to various desirable topics, drills, reviews, and the like for each 
available book or series could be determined, and vocabulary 
and number difficulties could be revealed. 

A number of communities have established criteria for mechanical
make-up which their textbooks must meet. For
example, the New York City Board of Superintendents has
established the following tentative standards:

1. Binding

Books should open flat; otherwise muscular fatigue results and 
rapid changes in reading distance cause eye fatigue. 

Strength of binding must not be ignored. The binding should be 
reinforced so that the book is held firm in the cover. The cover 
boards should be firm. 

The binding should be generally appropriate to the character of 
the use without unnecessary expense in the way of decoration or 
detail of manufacture. 

The side-thread (Singer) stitch should be allowed in grades for 
children up to nine years, provided there is sufficient inside margin 
to permit free opening. 

¹ See Fuller, Florence, Scientific Evaluation of Textbooks, Boston, Houghton Mifflin Company,
1928; Connor, W. L., Selecting Textbooks and Maps, Part I, Cleveland, Harter Supply
Company, 1927; and similar publications.

In books for children from nine to twelve years old and for adults
the side-thread stitch should be avoided. The books should be 
sewed so that they open flat. Music books should be sewed, not 
stitched. 

2. Paper 

Aside from considerations of durability, paper should be smooth, 
without gloss, hard-finished, opaque, but not too thin. Print marks 
should not show through. Paper should be white or with a very 
slight cream tone. 

3. Margin 

Fore edge should be not less than one-half inch; back edge not 
less than three-eighths inch; top edge not less than three-eighths 
inch; lower edge not less than five-eighths inch. 

4. Illustrations 

Illustrations should be both artistic and suitable in type for the 
grade. Line blocks of good size are preferred. The principle of 
color harmony should be adhered to. Colors should not conflict 
with reality. Illustrations should not be too small for clear and 
easy perception. Illustrations should not contain too many irrelevant
details. Highly glazed paper necessary to exhibit photographic
detail is permissible for older children.

5. Length of line 

The length of lines should not be more than 100 mm. (4 inches);
minimum standard 3 inches. Short lines inserted between illustration
and margin are not desirable.

6. Number of lines (leading)

A vertical space of 4 inches should not contain more than the
following:

Age                 Number of Lines
Under 7 years       10
7 to 9 years        20
9 to 12 years       22
Above 12 years      24

7. Space between words

Under 7 years       at least 3 mm.
Adult               at least 2 mm.

8. Space between letters

Under 7 years       .75 mm.
Above 8 years       more than .50 mm.

9. Height of small letters

Under 7 years       2.6 mm.
7 to 9 years        2.0 mm.
9 to 12 years       1.6 mm.
Adult standard      1.5 mm.

10. Width of vertical stroke

Under 7 years       not less than .24 mm.
Adult standard      not less than .25 mm.

11. Space within letters

Under 7 years       .8 mm.
7 to 9 years        .7 mm.
9 to 12 years       .6 mm.
Adult               .5 mm.

12. Character of type

The type should be clear and simple rather than ornate and
fancy; clean cut, and well defined, with as little difference as possible
between hair strokes and thick strokes.
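
Standards of this numerical kind lend themselves to a mechanical check. The
following is a minimal sketch in Python, not a part of the New York standards
themselves: it transcribes only the limits for length of line, number of
lines, and height of small letters, and it assumes that the heights given
under item 9 are minimums. The PageSpec record, the check_page function, and
the sample measurements are hypothetical and serve only for illustration.

```python
from dataclasses import dataclass

# Limits transcribed from items 5, 6, and 9 of the standards quoted above.
# Treating the item 9 figures as minimum heights is an assumption of this sketch.
MAX_LINES_PER_4_INCHES = {"under 7": 10, "7 to 9": 20, "9 to 12": 22, "above 12": 24}
MIN_SMALL_LETTER_HEIGHT_MM = {"under 7": 2.6, "7 to 9": 2.0, "9 to 12": 1.6, "adult": 1.5}
MIN_LINE_LENGTH_INCHES = 3.0
MAX_LINE_LENGTH_INCHES = 4.0


@dataclass
class PageSpec:
    """Hypothetical record of measurements taken from one textbook page."""
    line_length_inches: float
    lines_per_4_inches: int
    small_letter_height_mm: float


def check_page(spec, lines_band, height_band):
    """Return a list of plain-language violations; an empty list means none found."""
    problems = []
    if not MIN_LINE_LENGTH_INCHES <= spec.line_length_inches <= MAX_LINE_LENGTH_INCHES:
        problems.append("line length outside 3 to 4 inches")
    if spec.lines_per_4_inches > MAX_LINES_PER_4_INCHES[lines_band]:
        problems.append("too many lines in 4 vertical inches")
    if spec.small_letter_height_mm < MIN_SMALL_LETTER_HEIGHT_MM[height_band]:
        problems.append("small letters too short")
    return problems


# Illustrative measurements only; they describe no actual primer.
primer = PageSpec(line_length_inches=3.5, lines_per_4_inches=12, small_letter_height_mm=2.6)
print(check_page(primer, "under 7", "under 7"))
```

The age band is passed separately for each item because the standards do not
use the same bands throughout (item 6 ends with "above 12 years," item 9 with
an "adult standard").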


However, even these improvements have not solved the 
problem. The amount of content devoted to important topics 
does not guarantee that such content will contribute to the 
particular objectives deemed important; illustrations differ in 
significance, timeliness, and value, as well as in space relations; 
words are not units, with single difficulties, but may have as 
many difficulties as uses; and interest appeal to children and 
youth cannot be judged by mere study of the materials. 

A number of factors will aid in the solution of the problem. 
In the first place, publishers are becoming increasingly exacting 
in their requirements for the acceptance of manuscripts; careful
studies of objectives are made and checked; vocabulary
and interest studies are conducted; and illustrations, drills, and
the like are subjected to the closest scrutiny. In many instances
the competing publications of several publishers have
been prepared so carefully that all will serve a given curriculum.

In the second place, as general aims and purposes of education
and the important specific objectives implied in them are
more carefully determined, authors of textbooks and other 
materials utilize them. Eventually, this procedure alone will 
solve several problems. 

Methods of selection are also becoming more scientific. Research
has shown, for example, that interest appeal cannot be
judged or estimated any more reliably than difficulty or merit
by mere general-impression methods.

Table 29 presents the estimates of a group of primary teachers
of the interest appeal to their pupils of five primers, and
compares these estimates with actual pupil interest as revealed
by the extent to which these children read the primers when
allowed to choose from a sufficient number of copies. The
teachers examined the primers and rated them, and a composite
of ratings appears under “Teachers” in the table. The
interest of children was determined by the frequency with
which they chose the various primers. It is significant to
note that while a measure of agreement existed, the primer 
adjudged least interesting by the teachers was chosen oftenest 
by pupils for free reading in this particular study. 


Table 29. Interest Appeal of Five Primers

            Interest Appeal as Judged by
Primers     Teachers     Children
A           4            5
B           1            2
C           3            4
D           2            3
E           5            1
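
The agreement between the two columns of Table 29 can be summarized by a rank
correlation. The following is a minimal sketch in Python, assuming that the
entries in the table are ranks running from 1 (greatest interest appeal) to 5
(least). The coefficient used is Spearman's, which the text itself does not
name, and the function names are illustrative only.

```python
def spearman_rho(ranks_a, ranks_b):
    """Spearman rank correlation for two untied rankings of equal length."""
    n = len(ranks_a)
    if n != len(ranks_b) or n < 2:
        raise ValueError("need two rankings of the same length (at least 2)")
    d_squared = sum((a - b) ** 2 for a, b in zip(ranks_a, ranks_b))
    return 1 - 6 * d_squared / (n * (n * n - 1))


def rerank(values):
    """Convert values to ranks 1..n (smallest value gets rank 1; assumes no ties)."""
    order = sorted(values)
    return [order.index(v) + 1 for v in values]


# Table 29, read as (teachers' rank, children's rank) for each primer.
TABLE_29 = {"A": (4, 5), "B": (1, 2), "C": (3, 4), "D": (2, 3), "E": (5, 1)}

teachers = [t for t, _ in TABLE_29.values()]
children = [c for _, c in TABLE_29.values()]
rho_all = spearman_rho(teachers, children)

# Same comparison with Primer E set aside and the remaining four re-ranked.
without_e = {k: v for k, v in TABLE_29.items() if k != "E"}
rho_without_e = spearman_rho(
    rerank([t for t, _ in without_e.values()]),
    rerank([c for _, c in without_e.values()]),
)

print(f"all five primers: rho = {rho_all:.2f}")
print(f"without Primer E: rho = {rho_without_e:.2f}")
```

Under these assumptions the coefficient for all five primers is zero, while
the four primers other than E are placed in identical order by teachers and
children; the single reversal on Primer E accounts for the whole of the
disagreement.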


Content is being examined not only on the basis of space 
involved, but also to discover its relation to specific objectives 
of a particular curriculum. Two or three different types of 
materials are utilized with as many similar groups to aid in 
attaining identical objectives, and the results obtained are 
measured and compared. The difficulties of materials are being
tested in the same manner and by special formulas for determining
grade placement.

The Winnetka Chart² was developed to aid in determining
grade placement of recreational materials, and requires about
six hours per book. The Lewerenz³ formula has been developed
to aid in determining difficulty, diversity, interest appeal,
and other aspects of adequacy. Recent work of Lewerenz not
yet published has extended and improved the usefulness of his
contribution. Care must of course be observed in the administration
of any device for determining difficulty which depends
on samplings of vocabulary. Words may have not one but
several difficulty values when they have more than one meaning,
and when used with the same meaning may present different
difficulties in different constructions.

² Washburne, Carleton. Winnetka Chart for Determining Grade Placement of Children’s
Books. Winnetka Educational Press, Horace Mann School, 1934.

Illustrations may be subjected to actual try-outs with pupils 
to determine to what extent they are illustrative. Teaching 
aids may be studied from the standpoint of teacher utilization 
and objective evidence of pupil achievement. 

Overlapping curriculum content may be determined from inspection
and analysis, but its desirability can be determined
only by studying the extent to which the objectives involved
are being attained. Bush⁴ investigated the extent of overlapping
in the case of one hundred college students in Southern
California, with the following results:

1. All 100 averaged six weeks of repetition in English.

2. Thirty-three repeated a minimum of six weeks in Algebra. 

3. Thirty-six repeated a semester in Modern History. 

4. Eighteen repeated a year’s work in Chemistry. 

5. Twenty-six repeated a year’s work in a foreign language. 

Much of this overlapping occurs as the result of administrative
problems, but its effect can be determined only through
careful testing. When valid tests reveal that the high school,
as well as the college or university, can aid students in attaining
certain objectives, the students should be excused from repetition,
but valid standardized tests must be used on both levels
before comparable data can be obtained.

³ Lewerenz, Alfred S. Vocabulary Grade Placement Formula. Published by the Research
and Guidance Section of the Los Angeles City School District, 1931. See also Clark, Willis.
A Spelling Dictionary for Elementary School Pupils. Los Angeles City Schools, 1928.

⁴ Bush, Ralph H. “Overlapping of High School and College Courses”; in California
Quarterly of Secondary Education, vol. 5. (June, 1930.)

Elimination of undesirable content should be determined in 
the same manner; mere inspection is not a safe guide. Lack 
of functionality in attained objectives, as revealed by valid 
tests, should be substituted. 

Time allotments. Time allotments for units, topics, or subjects
may also be determined more adequately. The fact that
one teacher uses four or six weeks for a unit whereas another
consumes a whole semester does not guarantee any greater outcomes
for the latter. Actually, this time allotment guarantees
nothing for either teacher unless these units are guided with the 
distinct purpose of attaining certain definite and important 
objectives. Individuals differ considerably in their rate of 
learning, but such differences are provided for in procedure and 
activity-adjustments and through special aid and attention of 
the teacher. Normal class learning rates are relatively stable. 
Different units may be utilized to obtain the same objectives, 
or objectives which are just as important. But objectives to 
be attained should be indicated in a carefully developed course 
of study and should not depend upon the accidental nature of 
the unit which has happened to be selected. Some teachers 
spend a semester on a literary masterpiece, while other teachers 
devote two or three weeks to it; unless specific important 
objectives are determined beforehand, and tests reflecting 
these objectives follow, no one can possibly show whether either 
has succeeded. Time allotment must not depend upon the 
accidental nature of procedures used or the personal idiosyn¬ 
crasies of teachers. Time-allotment determination is prima¬ 
rily a matter of objectives. Whether dealing with subjects, 
units, or topics, regardless of the nature of the alleged out¬ 
comes, teachers must know more specifically what they are 
trying to do and must learn to test for obtained outcomes. 
There is no good reason for the present vagueness and wide 
variation in time allotments with their probable waste. The
amount of time necessary for normal classes to attain impor¬ 
tant objectives can be determined with a large measure of 
exactness. Justifiable deviations may then be allowed where 
special conditions warrant. 

Teaching-learning procedures. Most discussions of teaching 
procedures and learning activities are literary and philosophical 
rather than scientific, with due regard for facts. We consider 
almost all factors except the actual results obtained by our pre¬ 
ferred plans. Many excellent college and university research 
contributions, fragmentary though they must generally be, are 
unread or misinterpreted and have little effect on procedures 
actually used in the schools. Teachers are confused by the 
multiplicity of plans which conflicting leadership urges upon 
them, and their limited training in diagnosis and appraisal¬ 
testing has made it impossible for them to assimilate and utilize 
contributions actually made. Furthermore, some otherwise 
normal workers actually argue that we must ignore entirely 
the attempt to utilize results in determining the effectiveness 
of procedures. 

From the practical standpoint we must at present be satisfied 
to use many procedures the values of which are unknown, since 
we cannot suspend the direction of learning activities until we 
learn these values. However, we should be aware of the tenta¬ 
tive nature of many procedures, and the desirability of validat¬ 
ing them as soon as possible. Furthermore, it is desirable that 
new, strange, and far-reaching proposed changes should be 
validated before thousands of teachers plunge into largely un¬ 
guided experimentation, to the detriment of hundreds of thou¬ 
sands of children and youth. There is no longer any excuse for 
the meaningless, amateurish, unvalidated orgy of innovations
which have masqueraded during the past decade as the tech¬ 
niques of presumably professionally trained workers. One of 
our basic difficulties is the conviction of practical school men 
that nothing of importance can be learned in teacher-training
institutions, and that teachers must be retrained when they 
enter the profession. Workers in the field are too prone to put 
their faith in a procession of philosophical platitudes of uncer¬ 
tain meaning and trust their own oft-demonstrated unreliable 
general impressions, instead of learning how to diagnose diffi¬ 
culties and appraise results. Fundamentally, procedures are 
effective to the extent that they aid in translating objectives 
into outcomes; there can be no significant knowledge about 
procedures except in terms of outcomes. Regardless of the 
nature of conceived or alleged outcomes, no amount of argu¬ 
ment, however brilliant or pleasant, can compensate for igno¬ 
rance of actual outcomes. Only a young and reckless democ¬ 
racy could pour out its treasures so lavishly for teachers, build¬ 
ings, materials, and equipment, and wait so long before requir¬ 
ing any real evidence of outcomes. The surest guarantee of a 
continuance of this generosity and the proper education of 
succeeding generations of youth is the use of procedures vali¬ 
dated on the basis of outcomes. 

Measurement may be used to determine the relative merits 
of different teaching procedures, regardless of the level of learn¬ 
ing or the type of procedure. All that is required is that the 
procedures be clearly indicated, and that they be made the only 
variable in the study so far as that is possible. Two methods 
are available: the control-group and the rotation techniques.

In the control-group method of evaluating procedures at 
least two groups of pupils as similar as possible are necessary. 
Ideally there should be in the first group an identical pupil for 
each pupil in the second group. Actually, groups have been 
considered similar when the means and standard deviations *
have been approximately equal in the factors considered, such 
as I.Q., age, sex, and achievement. The learning activities of 
the two groups are then directed by the two procedures the 
relative merits of which are to be determined. The groups are
then tested with the same carefully validated test, and the
results compared. Care must be observed in computing the
reliability of the obtained differences.* The major steps in the
control-group method may be summarized as follows:

* See Chapters Fourteen and Fifteen.

1. Tests of I.Q., achievement, and the like, to determine equivalence.
If sufficient pupils are available, match as closely as possible; if 
not, eliminate pupils until means and standard deviations of their 
records are approximately equal. 

2. Make a definite list of the objectives to be attained.

3. Make clear-cut statements of just what activities will be used in 
each procedure, and follow these plans carefully. 

4. Keep all factors except teaching procedures as constant as pos¬ 
sible. 

5. Test each group with carefully devised tests, the items of which 
constitute the objectives to be attained. 

6. Compute the reliability of the difference, if any, of the obtained 
results. 
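
Step 1 leaves open how close "approximately equal" must be. As a minimal sketch, assuming hypothetical tolerances of one point on both the mean and the standard deviation, and using invented I.Q. records rather than data from any study cited here, the check might look like this in Python:

```python
from statistics import mean, pstdev


def summarize(scores):
    """Return (mean, standard deviation) for one group's scores."""
    return mean(scores), pstdev(scores)


def approximately_equal_groups(group_a, group_b, mean_tol=1.0, sd_tol=1.0):
    """True when means and standard deviations differ by no more than the tolerances.

    The tolerances are assumptions made for illustration; the text only
    asks that the two statistics be "approximately equal".
    """
    mean_a, sd_a = summarize(group_a)
    mean_b, sd_b = summarize(group_b)
    return abs(mean_a - mean_b) <= mean_tol and abs(sd_a - sd_b) <= sd_tol


# Invented I.Q. records for two small classes (not data from the text).
iq_a = [92, 98, 100, 104, 111]
iq_b = [93, 97, 101, 103, 110]

print(summarize(iq_a), summarize(iq_b))
print(approximately_equal_groups(iq_a, iq_b))
```

The same check would be repeated for each factor considered, such as age and prior achievement, before the two procedures are assigned.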

Zyve * compared a number of methods of teaching spelling, 
with the results indicated in Table 30. 

Unless differences obtained reveal chances of at least 278 to 
1, they are not considered significant; or, to put it another way, 
the difference between the mean results of two methods must 
be at least three times the standard error * of such difference.
Zyve's study, therefore, showed no significant statistical differ¬ 
ences between the methods compared. 
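
The three-standard-error rule can be written out as a short calculation. The sketch below assumes the usual formula for the standard error of the difference between two uncorrelated means (the text defers the computation to Chapters Fourteen and Fifteen); the function names and the sample figures are illustrative only.

```python
import math


def standard_error_of_difference(sd_a, n_a, sd_b, n_b):
    """Standard error of the difference between two uncorrelated means (assumed formula)."""
    return math.sqrt(sd_a ** 2 / n_a + sd_b ** 2 / n_b)


def is_significant(mean_a, mean_b, se_diff, required_ratio=3.0):
    """Rule stated in the text: the difference must be at least three times its standard error."""
    return abs(mean_a - mean_b) >= required_ratio * se_diff


# Hypothetical results: two groups of 100 pupils, each with a standard deviation of 10.
se = standard_error_of_difference(10, 100, 10, 100)
print(round(se, 2))
print(is_significant(80, 75, se))  # a difference of 5 points
print(is_significant(78, 75, se))  # a difference of 3 points
```

Where pupils are matched pair by pair the two sets of scores are correlated, and this simple formula would overstate the standard error.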

* See Chapters Fourteen and Fifteen.

* Zyve, C. T. An Experimental Study of Spelling Methods. Teachers College Contributions
to Education, no. 466. New York, Columbia University, 1931.

Table 30. Comparison of Methods of Teaching Spelling (Zyve)

Factors Compared                    Factors Favored           Approximate
                                                              Chances

Teacher-directed vs. individual     Teacher-directed          20 to 1
study; use of lists and             study
sentences constant

Lists only vs. list-context         List-context              18 to 1
method; individual and directed     method
study constant

Thirty minutes weekly added         Home-work review          5 to 1
home-work review vs. no
home-work review; review
methods mixed

Thirty minutes weekly added         No home study             2 to 1
home-work review vs. no
home-work review; list-context
method; teacher-directed
study constant

Teacher-directed, list-context      Teacher-directed,         155 to 1
method vs. individual study,        list-context
words in lists                      method

Because of the necessity and difficulties of equating groups
with the control-group method, the rotation technique has been
widely used. With this method any number of groups, regardless
of their ability or previous achievement, may be utilized.
Learning activities of each group are directed successively by
the procedures to be evaluated, and the gains made through
each procedure are accumulated and compared as before. This
procedure is illustrated in Table 31, where the respective merits
of two procedures, designated as the “direct” and “textbook”
methods of teaching French, are compared.

Table 31. Tabulation of Vocabulary Scores for Groups
Taught by Two Methods, Direct and Textbook

                 Gains in Score
Period       Group A        Group B
 (1)           (2)            (3)

  1          (D)  .4        (T) 1.7
  2          (T)  .5        (D) 1.5
  3          (D) 2.1        (T)  .8
  4          (T) 1.4        (D) 3.1

Table 31 reveals that two groups of pupils were used, and
that each was taught alternately by the direct and the textbook
method. The study ran for four experimental periods, and
each group was taught twice by each method. Before the first
period, and at the end of each of the four periods, each group
was tested with the American Council French Vocabulary
Test. The total of the gains registered by the direct method
was 7.1, and by the textbook method 4.4, making a difference
of 2.7. Since the standard error of the difference * was 1.8, and
the obtained difference of 2.7 is not at least three times this
quantity, the difference between these procedures cannot be
considered significant.

The principal steps in the rotation technique may be sum¬ 
marized as follows: 

1. Make a definite list of the objectives to be attained. 

2. Make clear-cut statements of the activities to be used in the pro¬ 
cedures to be compared. 

3. Test each group before the first experimental period, using a test 
with sufficient range to test expected gains. 

4. Direct the learning activities of each group successively by the 
various procedures to be compared. 

5. Test each group for gains after each period, using the same test
first used.

6. Add the gains from each method at the end of the study. 

7. Compute the significance of the differences in gains made through 
the use of different procedures. 
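
Steps 6 and 7 can be retraced with the gains reported in Table 31. The following Python sketch is only an illustration: the record layout and names are assumptions, and the standard error of 1.8 is taken from the text rather than computed.

```python
# Gains from Table 31 as (period, group, method, gain); D = direct, T = textbook.
GAINS = [
    (1, "A", "D", 0.4), (1, "B", "T", 1.7),
    (2, "A", "T", 0.5), (2, "B", "D", 1.5),
    (3, "A", "D", 2.1), (3, "B", "T", 0.8),
    (4, "A", "T", 1.4), (4, "B", "D", 3.1),
]

SE_DIFFERENCE = 1.8  # reported in the text, not derived here


def total_gains(records):
    """Step 6: add the gains credited to each method across all groups and periods."""
    totals = {}
    for _period, _group, method, gain in records:
        totals[method] = totals.get(method, 0.0) + gain
    # Round to one decimal to match the precision of the table.
    return {method: round(total, 1) for method, total in totals.items()}


totals = total_gains(GAINS)
difference = round(totals["D"] - totals["T"], 1)
ratio = difference / SE_DIFFERENCE
significant = ratio >= 3  # Step 7: the three-standard-error rule

print(totals, difference, round(ratio, 2), significant)
```

The totals reproduce the 7.1 and 4.4 given in the text, and the ratio of the difference to its standard error falls well short of three.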

Several difficulties develop in using the control-group and 
rotation techniques. First of all, procedures to be compared
often involve many overlapping activities. In laboratory- 
versus textbook-method studies, for example, many pupils in 
the laboratory group read textbooks, and many laboratory 
experiments are described in textbooks themselves and read by 
the textbook group. The result is that many of these studies 
show no significant differences between procedures because of 
duplicate learning activities. On the other hand, the attempt 
to avoid all duplication of activities has sometimes resulted in
procedures so artificial that they would not be used except in
such a study.

* See Chapters Fourteen and Fifteen.

In the second place, the fact that one procedure may appear 
superior to another proves nothing about the merits of either; 
both may be good or bad. However, new ideas and procedures 
may periodically be compared with those in use, and the better 
retained until displaced. 

The major difficulty with procedure studies is probably the 
undeserved authenticity and effectiveness with which they 
appear to stamp certain methods. The fact that a method has 
been investigated should not per se give it added standing. 
Most of the methods studied so far have been traditional in 
nature, and the results obtained would probably not be valid in 
terms of newer concepts of the nature and function of subject- 
matter. Newer procedure studies are needed in terms of the 
learning activities actually utilized by large groups of teachers 
in service. It is possible that with the prevalent vagueness of 
objectives, traditional and progressive procedures would reveal 
no significant differences in outcomes. 

Fusion, integration, and correlation. Such terms as “fu¬ 
sion,” “integration,” and “correlation” are pure subject- 
matter concepts as used by most would-be progressives, who 
are still thinking largely in terms of new manipulations of 
subject-matter. Combining social studies and English into 
one period (often one-half given to each), or music and art in a 
similar manner, is futile. If subject-matter is fused or person¬ 
alities are better integrated through reorganization of curricula, 
it must come through a more specific determination of aims 
and purposes. All types of subject-matter from all fields may 
then be drawn upon as needed. 

There is much need of validation in connection with the 
unit-and-activity curriculum. Many units are merely rechris¬ 
tened units of subject-matter. Activities are as stereotyped 
and traditional as any subject-matter curriculum ever produced.
Most so-called “units” take something of the following
form:

1. General aim or aims: Usually present.
     Examples:
       1. How Germany became a great commercial nation.
       2. Transportation.
       3. The citrus industry in California.

2. Specific objectives: Usually absent.
     Facts, information, understanding, appreciations, skills, habits,
     and the like, through which the general purposes, generalizations,
     ideals, and attitudes will be attained, should be provided.
     Examples: Usually missing. Often several restatements of
     general aims appear here.

3. Subject-matter: Usually indicated.
     Difficulties:
       1. Often too difficult for students to read.
       2. Lacks interest.
       3. Topics correspond with general aims, but specific content
          unsuited.
       4. Too much. Pupils cannot cover enough material to get
          a rich understanding of major aims.

4. Activities: Usually selected.
     Examples:
       1. Reading (research investigation).
       2. Reporting to class.
       3. Visits to stores, factories, banks, and so forth.
       4. Making models.
       5. Making maps.
       6. Making collections.
       7. Drawing pictures.

5. Approach: Usually present.
     Difficulties:
       1. Time often wasted in indirect attack.
       2. Purpose not understood; interests of students not
          identified with major aims and activities.

6. Guidance plan: Usually absent.
     Desirable features:
       1. A pre-test of objectives.
       2. Tentative time plan for whole unit.
       3. Tentative time plan for pursuing and reporting each
          activity.
       4. Diagnostic tests.
       5. Culminating activities.

7. Unit-test: Absent.
     Nature:
       A well-balanced sampling of all important outcomes which
       the unit was intended to aid students in attaining.

A basic need in connection with independent units is the de¬ 
termination of specific objectives, and their organization 
around the general aims and purposes which they aid in attain¬ 
ing. Subject-matter cannot be adequately chosen without 
such knowledge. But the most important function of measure¬ 
ment is the determination of the extent to which objectives are 
translated into outcomes. For this purpose, pre-tests, diag¬ 
nostic tests, and unit-tests should be used. 

For example, it is pure waste to allow students to spend 
time on objectives already attained. If, therefore, teachers 
know specifically enough the objectives of a given activity, 
they may give a pre-test on these objectives and guide educa¬ 
tional activity toward the attainment of unattained objectives. 
When difficulties occur, diagnostic tests will reveal them. 
Some activities are probably very useful in producing desirable 
outcomes; others may have little value. Measurement will 
reveal the facts. It is important to find out what activities 
produce what outcomes. If ideals, attitudes, loyalties, and 
other generalizations made famous by the leaders of the pro¬ 
gressive movement are objectives, we must specify exactly 
what ideals, attitudes, and loyalties they are, and we must 
determine through what specific objectives these generaliza¬ 
tions are attained. They do not just blossom mysteriously 
because of some newly conceived and propagated procedure. 
To give them directly as indoctrinations violates the very basic 
tenet of progressivism, and constitutes, from this viewpoint, a 
most flagrant example of fact-teaching. 
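
The use of a pre-test to direct effort toward unattained objectives can be illustrated briefly. In this Python sketch the objective names, the class results, and the eighty per cent mastery level are all hypothetical; the text prescribes none of them.

```python
def unattained_objectives(pretest_results, mastery=0.80):
    """Return the objectives on which the class fell below the assumed mastery level.

    pretest_results maps each specific objective to the proportion of
    pre-test items on that objective answered correctly.
    """
    return [objective for objective, proportion in pretest_results.items()
            if proportion < mastery]


# Invented pre-test results for a unit on transportation.
results = {
    "reads a railway timetable": 0.92,
    "locates major ports on a map": 0.55,
    "explains why freight rates differ": 0.40,
}

print(unattained_objectives(results))
```

Objectives already attained drop out of the plan, and class time is reserved for those that remain.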
as applied to education is simply a statement of how traits,
qualities, ability, and achievement are expected, theoretically, 
to be distributed. Such normal distributions are never ob¬ 
tained except by chance, but the distributions actually ob¬ 
tained usually resemble the theoretically normal distribution 
more than any other. By utilizing the mathematical proper¬ 
ties of the normal distribution it is possible, among other 
things, to determine the difficulty of test items and to scale tests 
and develop product scales; to determine the extent to which a 
particular group of scores represents a normal distribution and 
may therefore serve as the basis for drawing valid conclusions; 
to make deductions concerning the discrimination of a test 
when it has been applied to a presumably unselected group; to 
determine the reliability of means and differences between 
means and other types of statistical computations; and to 
transmute the scores of tests with differing numbers of test 
items into more comparable units. 

Figure 63 represents a normal distribution of I.Q.’s when the 
mean is 100. 

Figure 63 illustrates the mathematical properties of the 
normal distribution in terms of whole standard deviations. 
The σ of a normal distribution of I.Q.’s with a mean of 100 is
about 16 I.Q. points. Figure 63 reveals that after 3 have



FIG. 63. NORMAL DISTRIBUTION OF SCORES (I.Q.’s)
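
The proportions of a normal distribution lying within whole standard deviations of the mean can be computed directly. The sketch below, in Python, takes the mean of 100 and the standard deviation of about 16 I.Q. points from the text; the function names are illustrative, and the calculation is the standard normal-curve integral rather than anything read from Figure 63.

```python
import math

MEAN_IQ = 100
SD_IQ = 16  # "about 16 I.Q. points", per the text


def normal_cdf(z):
    """Proportion of a normal distribution lying below z standard deviations."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def proportion_within(k):
    """Proportion of cases lying within k standard deviations of the mean."""
    return normal_cdf(k) - normal_cdf(-k)


for k in (1, 2, 3):
    low, high = MEAN_IQ - k * SD_IQ, MEAN_IQ + k * SD_IQ
    print(f"I.Q. {low} to {high}: {proportion_within(k):.4f}")
```

Roughly two thirds of cases fall within one standard deviation of the mean, about 95 per cent within two, and nearly all within three.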
8. Explain how measurement may aid in the improvement of teaching.
Why are specific objectives important in improving procedures? 

9. Indicate difficulties experienced and the limitations of results ob¬ 
tained in using the control-group and rotation techniques for deter¬ 
mining the relative merits of procedures. 

10. Explain in detail how measurement may contribute to the valida¬ 
tion and usefulness of the unit-and-activity curriculum. 
CHAPTER THIRTEEN


DIAGNOSIS OF TEACHING 
DIFFICULTIES 


In the past, most devices for the evaluation of teaching and 
teachers have been used as a basis for the selection, promotion, 
demotion, or dismissal of teachers, and for the modification of 
salaries. The most often used device has been some form of 
rating plan, but the pressure of other duties has often made 
such a plan a mere formality, to the extent that some school 
officials rate teachers without even visiting their classrooms. 
The major objective of teacher and teaching evaluation should 
be the diagnosis of teaching difficulties and the improvement of 
teaching and learning. If this objective can be successfully 
accomplished the same results may be used to serve other pur¬ 
poses as well. 

Clarification of point of view. Teaching and teacher difficul¬ 
ties are basically different problems, and failure to distinguish 
between them has been a cause of confusion. In this book 
major consideration will be given to the former. Teacher 
selection will receive attention only as it is related to the basic 
problem of evaluation, and the plans themselves will be pre¬ 
sented and evaluated. 

Teacher objections to traditional evaluation. Many teach¬ 
ers have objected to the rating plans in current use, because of 
their general unreliability. In a conclusively large per cent of 
the plans studied, except those which reveal a more or less 
frozen subjectivity, reliability is so low as to disqualify them 
as instruments of decision or policy. In the second place, in 
spite of much discussion about improvement of teaching, 
teachers report that ratings have had little if anything to do
with improvement. They point out that rating officers fre¬ 
quently rate them with little or no inspection or knowledge of 
their work, that ratings are often secret, and that often they 
cannot obtain their own ratings. Furthermore, conferences 
on the ratings are not held, or, when held, are on considerations 
other than teaching objectives, procedures, or outcomes. Fre¬ 
quently when teachers ask for aid, rating officers are unable to 
give it, or suggest procedures which have been condemned by 
teacher-training institutions and school officials. Other teach¬ 
ers report that certain rating officers have peculiarities, singling 
out, for example, one element of the total teaching situation, 
and judging entirely on this detail. Other rating officers toler¬ 
ate but one unvalidated and uncertain method in a given situa¬ 
tion, and rate low regardless of the apparent excellence of 
results with a different procedure. Teachers point out, fur¬ 
thermore, that there is no such thing as a criterion or standard 
of teaching success and that rating officers often tacitly admit 
this fact. Finally, rating systems are used to support dismis¬ 
sals which have previously been determined on other grounds. 

Score cards, rating scales, and check lists. Many devices 
have been developed to improve evaluation of teaching and 
teachers. At the beginning of the century Boyce reported a 
high relationship between forty-five traits and teaching effi¬ 
ciency. His score card served as the prototype of a long line of 
successors which appeared in many variations as time went on. 
Some of these types will be presented. 

The simplest form of score card consists of a number of traits,
qualities, or abilities, and a provision for indicating reactions,
as illustrated in Figure 57.

Variations of this score card were developed in the form of 
rating plans and scales. The qualities and traits remained 
much the same, but different types appeared for indicating the 
score or making the rating. 
Qualities                          Score (0 to 100)

1. Professional qualities
2. Social qualities
3. Co-operative qualities
4. Leadership
5. Professional attitude
6. Scientific attitude
7. Teaching ability

FIG. 57. SIMPLE SCORE CARD


Figure 58 illustrates some of these variations. 

In the first, or general, type the rater simply checks the
appropriate columns for fail, poor, average, good, or excellent.
In the second type the rater does not attempt five-point
distinctions, but indicates whether the individual’s rating is
satisfactory or unsatisfactory. In the third, or per cent, type, the
rater utilizes the traditional teacher-marking technique, ranging
from 0 to 100. In the weighted type, an attempt has been
made to differentiate between qualities, traits, or characteristics
in relation to their assumed importance in teaching success.
In the last type, the rater describes in his own language
his judgments of the traits and qualities rated.

For reasons to be discussed later these forms of rating proved 
generally unreliable and unsatisfactory, and other devices were 
tried. Figure 59 illustrates the so-called man-to-man rating 
scale in which individuals, rather than ratings of poor, average, 
and superior, were used to indicate variations in the qualities 
rated. 

In rating a particular teacher in teaching skill the rater 
simply recalls the most and the least skillful teachers he has 
ever known, and then fills in three others of appropriate skill. 
Rating consists of checking the name of the teacher whom his 
subject most nearly resembles. This device also possessed seri¬ 
ous limitations, and still other plans were developed. 

Degrees of              Scale
Teaching Skill          Teacher          Check One

Most skillful           E. Jones
Good in skill           J. Doe
Average skill           P. Williams
Poor in skill           W. O’Conner
Least skillful          T. White

The rater checks the teacher whom his subject most nearly
resembles in skill.

FIG. 59. MAN-TO-MAN RATING OF TEACHING SKILL

The Department of Class Room Teachers of the National
Education Association devised special scales for rating both
drill techniques and problem work. Figure 60 presents the
plan for drill technique in abbreviated form.


Score   Factors present

  0     Useless material; no interest; poor technique;
        indifferent teacher
 20     Well-chosen material
 40     Well-chosen material; thorough motivation
 60     Well-chosen material; thorough motivation; good technique
 80     Well-chosen material; thorough motivation; good technique;
        socialized recitation
100     Well-chosen material; thorough motivation; good technique;
        socialized recitation; skillful teacher

FIG. 60. DEPARTMENT OF CLASS ROOM TEACHERS’ ABBREVIATED
RATING PLAN FOR DRILL TECHNIQUE


This drill-technique rating plan is essentially a six-point cumu¬ 
lative scale in which the rating factors are assumed to be of 
equal importance in effective drill work, and are further 
assumed to be the most important factors related to effective 
drill work. 
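
The cumulative character of this scale may be made explicit in a few lines of Python. The sketch assumes one reading of Figure 60: each of five factors, taken in the order shown, adds twenty points, and credit stops at the first factor found wanting. The factor labels follow the figure; the function name and the stopping rule are assumptions.

```python
# Factors in the order in which Figure 60 adds them, left to right.
FACTORS = [
    "well-chosen material",
    "thorough motivation",
    "good technique",
    "socialized recitation",
    "skillful teacher",
]
POINTS_PER_FACTOR = 20


def drill_rating(observed):
    """Score 0 to 100: add 20 for each factor present, stopping at the first one absent."""
    score = 0
    for factor in FACTORS:
        if factor not in observed:
            break
        score += POINTS_PER_FACTOR
    return score


print(drill_rating({"well-chosen material", "thorough motivation"}))
print(drill_rating({"well-chosen material", "good technique"}))
```

Written this way, the two assumptions named in the text are plain to see: every factor carries the same weight, and no factor outside the list can affect the score.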

Monroe and Clark * developed a composite method for rating
teachers which includes achievement of pupils, personality,
intelligence, experience, and training. The plan is presented
in Figure 61.

* Monroe and Clark. “Measuring Teaching Efficiency”; in Educational Research Bulletin,
University of Illinois, no. 25, 1924.




                                            Ratings
                               Minimum   Median               Maximum   Totals

I.   Achievement of Pupils       50      100                    150
     (Basis of I.Q.)

II.  Personality                 13      44                     76
     (Man-to-Man Scale)

III. Intelligence                10      Norm for teachers      40
     (Standardized Scale)                considered 25

IV.  Experience of Teachers      0       4 points for each      24
                                         year up to 6

V.   Training
       Elementary                0       2 points for each six weeks
                                         at a normal school or college
       High                      0       5 points for each year of
                                         20 weeks at a college or
                                         normal school

FIG. 61. MONROE AND CLARK COMPOSITE PLAN FOR RATING TEACHERS


Ratings are expressed in total scores from which the nature and 
extent of differences in probable teaching efficiency are inferred. 
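
How such a total might be assembled is sketched below in Python. The experience rule (four points a year, counting no more than six years) is taken from Figure 61; treating the total as a simple sum of the five parts is an assumption, and the training points are supplied by the caller because the figure's training rules are only partly legible. All names and the sample teacher are hypothetical.

```python
def experience_points(years):
    """Four points for each year of experience, counting at most six years (0 to 24)."""
    return 4 * min(max(years, 0), 6)


def composite_score(achievement, personality, intelligence,
                    years_experience, training_points):
    """Assumed total: the plain sum of the five component ratings."""
    return (achievement + personality + intelligence
            + experience_points(years_experience) + training_points)


# A hypothetical teacher at the median on the first three parts,
# with three years of experience and ten training points.
print(experience_points(3))
print(composite_score(100, 44, 25, 3, 10))
```

A plain sum lets pupil achievement, with its range of 50 to 150, outweigh the other components, which may or may not reflect their real importance.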

Courtis developed comprehensive descriptions of what he 
designated as compulsion, preparation, motivating, and pur¬ 
posing types of teachers; teacher rating consists of identifying 
subjects rated with one of these types. Brueckner * improved
on Courtis’s technique by developing a scale of descriptions 
within each type described by Courtis; the average displace¬ 
ment of second ratings from first ratings is less than seven 
per cent of the range of the scales. 
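
The seven per cent figure is a measure of agreement between two ratings of the same teachers. As a hedged illustration in Python, the sketch below assumes that "average displacement" means the mean absolute difference between first and second ratings, expressed as a percentage of the scale's range; the ratings shown are invented.

```python
def average_displacement_percent(first, second, scale_min, scale_max):
    """Mean absolute change between two sets of ratings, as a per cent of the scale range."""
    if len(first) != len(second) or not first:
        raise ValueError("ratings must be paired and non-empty")
    displacements = [abs(a - b) for a, b in zip(first, second)]
    mean_displacement = sum(displacements) / len(displacements)
    return 100 * mean_displacement / (scale_max - scale_min)


# Invented first and second ratings of four teachers on a 0-to-100 scale.
first_ratings = [40, 60, 80, 50]
second_ratings = [45, 55, 80, 60]

print(average_displacement_percent(first_ratings, second_ratings, 0, 100))
```

A small value shows only that raters repeat themselves; it does not show that the ratings are valid.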

* Brueckner, L. J. “Scales for the Rating of Teaching Skill”; in Educational Research
Bulletin, University of Minnesota, no. 12, 1937.

The last appraisal device to be presented is the graphic
rating plan, or scale, widely used in business, government, and
industry, as well as in education. An illustration of this plan
is given in Figure 62. The essential features of such a plan
are as follows:

1. Lines which represent the total range of manifestation of a trait, 
quality, or ability. 

2. Descriptions of definite points on this range; usually three or five 
points are defined. 

3. Numerical values for points on each item range. 

4. The reliabilities of individual items of the device are sometimes 
computed and indicated and the correlation of the total score 
with outside criteria determined. 


                                                            Reliability

Teaching Type             1    2    3    4    5                 .78
    Leads, inspires, motivates; encourages pupil activity
    Ordinary; combines motivation and compulsion
    Compulsion type; does all judging, initiating, instructing, etc.

Individual Differences    1    2    3    4    5                 .74
    Knows strength and limitations of every pupil; treats them
      accordingly
    Knows portion of her class; does not understand others;
      provides special materials for some of the bright pupils
    Does not recognize individual differences; teaches at the
      class; uses mass instruction

Aims                      1    2    3    4    5                 .76
    Knows objectives of each phase of education; utilizes methods
      to attain most desirable outcomes
    Conscious of many aims of education, but they do not always
      function
    No knowledge of aims; blunders; may accidentally attain
      some objectives

Motivation                1    2    3    4    5                 .74
    Inspiring; makes pupils conscious of objectives; uses graphic
      aids; praises
    Gives many directions; repeats; reminds pupils of consequences
      of failure
    Inspires fear; threatens punishment for slight offenses;
      threatens failure

Skill                     1    2    3    4    5                 .82
    Expert; resourceful; utilizes instinctive tendencies; makes
      objectives attractive
    Average; pupils make some contribution; many work
      independently with fair success
    No pupil participation; teacher conducts and controls; work
      mechanical; pupils helpless; no results

FIG. 62. ILLUSTRATION OF GRAPHIC RATING BLANK

From Tiegs, E. W. An Evaluation of Some Techniques of Teacher Selection.
Bloomington, Public School Publishing Company, 1928.
The numbers to the right of each item in Figure 62 represent
the reliabilities of single items; the reliability of the total
forty-one-point blank from which these items are taken is
.93 ± .008. The Almy-Sorenson Rating Scale for Teachers,*
and the Dragoo Rating Scale for Shop Teachers,* are of this type.

Check lists are frequently substituted for other plans of 
evaluation, varying widely in the number of items and the 
relation of items to learning activities. The following is a 
typical list of items; 

1. How does the teacher receive her class?

2. Are materials at hand, plans for work present, and administra¬ 
tion of passing materials worked out? 

3. Is classroom work begun promptly? 

4. What is the attitude of the pupils toward the teacher? 

5. Is there evidence of careful planning? 

6. Are aims clearly stated?

7. To what extent is the present recitation tied up with preceding 
recitations? 

8. How much illustration material is used? 

9. Are good habits of work in evidence? 

10. Do children waste time? 

11. Are practical results in evidence? 

12. Does class judge own results and form conclusions?

Usually the rater checks those items on which the teacher 
passes, counts them, and marks the teacher accordingly. Such 
lists are also used as the basis for conferences between teachers 
and rating officers. 

Evaluation of rating techniques. Objections of teachers, 
most of which are justified by the facts, have already been 
presented. Evidence against most traditional rating plans is 
overwhelming and conclusive. Such plans are almost entirely 
appraisal in nature, and have had little relation to activities 
in which the teacher engages. 

* Published by the Public School Publishing Company, Bloomington, Illinois, 1939.

Another difficulty in the use of rating blanks or scales is
their confusion with measurement instruments. Developing
blanks to record the same old unreliable guesses and estimates
does not in itself improve ratings, but has made it possible to
investigate them and to reveal their shortcomings.

Where the general type of rating plan, illustrated in Figure 58, 
is used, it frequently happens that the same subject is rated at 
opposite ends of the scale by different raters. The items of 
such rating scales constitute such vague concepts that they 
allow wide variations in interpretation. 

Raters rather consistently rate small groups of teachers on a 
limited number of items, and point to this as evidence of the 
reliability and value of the rating scale. For the most part, 
this is simply frozen subjectivity, or the “halo” effect of gen¬ 
eral impressions. Apparent consistency in teacher rating is 
often the result of copying, sometimes with slight modifica¬ 
tions, the ratings previously given. 

Utilizing the S. and U. technique simply reduces the possibility 
of variation in rating. It does not in itself improve the 
ratings, which, when made, do not aid in the improvement of 
teaching. The per cent and weighted-score systems are no 
more reliable or valid than any other. What does a rating of 
eighty-five per cent in general intelligence, or 200 in teaching 
ability, mean? What ought teachers to obtain in these rat¬ 
ings? And how do these ratings point out teaching difficulties 
which must be eliminated if teaching and learning are to be 
improved? 

The man-to-man scale is based on a fallacy which sometimes 
results in having the same teacher who served as the horrible 
example on one scale also serve as the exemplification of the 
finest qualities on another. It is based on the assumption 
that, by choosing scale teachers (an admitted impossibility to 
begin with, otherwise no scale would be needed) and rating 
other teachers on this unreliable basis (which multiplies 
errors), relatively accurate results are obtained. 

There is no evidence to indicate that the choice of factors is 
correct in the Class Room Teachers type of drill-technique rat¬ 
ing plan. The nature of available materials is often beyond 
the teacher’s control. Neither is there any evidence that the 
factors used are of equal value. Ineffective direction of learn¬ 
ing may be due to misuse and varying combinations of factors 
rather than merely to the absence of some of them. 

The Monroe and Clark type of plan calls attention to the de¬ 
sirability of considering additional factors which may be useful 
in appraising teachers, but, like the others described, it has 
little to do with diagnosis of difficulties or improvement of 
teaching. Personality cannot be dealt with by the plan sug¬ 
gested, the man-to-man scale. There is little correlation be¬ 
tween teacher I.Q. above a certain undetermined minimum, 
and so-called teaching success. Neither training nor experience, 
both of which vary greatly, reliably predicts success in a 
given situation. 

Descriptions of different types of teaching usually bury rather 
than reveal the facts regarding teaching difficulties even though 
they may be relatively reliable. Because of their length and 
nature they have not been developed in connection with most 
of the important factors presumably related to teaching and 
learning. 

The graphic rating scale can be constructed to yield higher 
measures of agreement among raters, and between first and 
subsequent ratings of the same raters. It may therefore be 
said to have a generally higher reliability, but its significance 
cannot be determined because no reliable criterion of teaching 
success exists against which to test its validity. Like almost 
all rating plans, but to a lesser degree, graphic rating scales so 
far produced call for “conclusions of the witness” on every 
item instead of the recording of observable and identifiable 
facts which may later be considered in toto. For example, the 
item, “knows objectives of each phase of education,” under 
aims in Figure 62, calls for a judgment which depends first upon 
whether there are definite objectives; secondly, upon whether 
or not the rater knows them; and thirdly, upon whether or not 
evidence of their presence or absence is sufficient basis for a 
reliable judgment. 

In general, check lists have the same weaknesses regarding 
conclusions. In addition, they lack the descriptions of the 
different points on the graphic scale, which make differentia¬ 
tion possible. They assume, furthermore, that certain evi¬ 
dences are favorable or unfavorable without recourse to actual 
results obtained. For example, some children always waste 
some time (as do adults) in every classroom, hence no teacher 
or few teachers would ever receive a favorable rating on this 
point, even though nature is to blame, and not they, and 
measured results are satisfactory. 

Finally, while ratings may serve as a basis for conferences 
between teachers and rating officers, rating plans and scales 
thus far developed still deal too much with general concepts 
far removed from the specific problems of teachers. If their 
validity is assumed, they may be used for appraisal purposes, 
but they give inadequate aid in the improvement of teaching. 
We approach our problem as if we judged the quality of coffee 
by studying the china in which it is served, or measured the 
dimensions of the sales room to guarantee the fit of a coat. In 
trying to establish a criterion for teaching success we ask a 
hundred people who do not know, average their ignorance, and 
produce a definition which is as far removed from the problem 
and as useless as ever. We consider virtually everything except 
pupil attainment, the purpose for which schools are established. 

Teaching efficiency. Teaching efficiency should be defined 
as the extent to which teachers aid learners to attain the objectives 
of the curriculum. A number of related factors such 
as the ability-level of the class, the previous attainment of the 
class, previously learned work habits of the class, and equipment 
and supplies, all influence achievement and should be 
taken into consideration. Ideally, therefore, teaching success 
or efficiency should be expressed primarily in terms of changes 
in pupils, rather than in terms of traits and qualities which 
largely defy analysis. 

It is true, however, that the above factors complicate the 
problem of determining when teaching is effective, and, more¬ 
over, that many administrative and rating officers have had 
little training in measurement and can make no adequate eval¬ 
uation of outcomes. But there is another method open to those 
who must evaluate teachers which also provides the basis of 
improving learning. This method, known as the teacher- 
activity method, should be used for diagnostic purposes in con¬ 
junction with evidence of actual pupil attainment. It will be 
described later. 

Intelligence and achievement tests. Intelligence-test scores 
have been found to have no close relation to the variety of 
activities involved in teaching success. This may be due, 
among other things, to the fact that these activities do not 
require a high degree of mental maturity; or to the fact that 
teachers as a group are highly selected; or it may simply reflect 
the inadequacy of the intelligence test as a criterion. Achieve¬ 
ment tests in educational psychology, and in subjects such as 
English and mathematics, also have little relation to teaching 
success as represented by ratings of supervisors and principals. 
Although such tests reveal statistically significant differences 
between teachers in knowledges and understandings, these 
factors alone are apparently not the crucial ones in teaching 
success. 

Professional tests for teachers. A number of workers have 
devised standardized tests which are more closely related to 
teaching activities than are rating scales. Such tests, if significantly 
related to teaching efficiency, would be helpful in teacher 
selection even if they were of little value in the diagnosis of 
teaching difficulties. 

Among the tests developed are the following: 

1. The Moss, Hunt, Wallace Teaching-Aptitude Test.* This 
test was developed by the Department of Psychology of the 
George Washington University, and consists of five parts: 

Part 1. Judgment in teaching situations (60 multiple-choice items) 

Part 2. School problems (80 true-false items) 

Part 3. Comprehension and retention (20 questions) 

Part 4. Recognition of mental states from facial expressions (20 
multiple-choice items) 

Part 5. Observation and recall of details of a fight (20 questions) 

The test requires forty minutes; norms are available for high- 
school seniors, normal-school students, teachers’ college grad¬ 
uate students, and experienced teachers. The latter are 
further subdivided into city, town, and country divisions. The 
coefficient of reliability, chance-halves method, corrected by 
the Spearman formula, is reported as .91 for 100 teachers. The 
correlation between this test and the George Washington Social 
Intelligence test is .60. Correlation between this test and 
supervisors' judgments of teaching efficiency ranges from .30 
to .50. 
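
The “Spearman formula” referred to above is presumably what is now called the Spearman-Brown correction, which steps up the correlation between two half-tests to an estimate for the full-length test. The sketch below is illustrative only: the function names are hypothetical, and the half-test value is worked backward from the reported .91, not taken from the test manual.

```python
def spearman_brown(r_half, factor=2.0):
    """Estimated reliability of a test lengthened `factor` times,
    given the correlation between the shorter forms."""
    return factor * r_half / (1.0 + (factor - 1.0) * r_half)


def half_test_r(r_full):
    """Invert the doubling case: the half-test correlation that would
    be stepped up to `r_full`."""
    return r_full / (2.0 - r_full)


# Assumption: the reported .91 is the corrected (full-length) figure.
r_half = half_test_r(0.91)
print(round(r_half, 3))                  # implied correlation between halves
print(round(spearman_brown(r_half), 2))  # stepping it up recovers .91
```

On this reading, a corrected coefficient of .91 implies a correlation of roughly .83 or .84 between the two halves themselves.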

2. The Coxe-Orleans Prognosis Test of Teaching Ability.† 
This test was designed primarily to select and classify students 
in teacher-training institutions, but it is also used in selecting 
teachers for appointment. It consists of 410 items arranged 
in five parts as follows: 

I. General information (78 items) 

II. Professional interest (100 items) 

III. Statistics, tests and measurements, psychology, etc. (99 
items) 

IV. Reading comprehension (63 items) 

V. Problems in education (70 items) 

* Moss, F. A., Hunt, T., and Wallace, F. C. Teaching-Aptitude Test. Washington, D.C., 
Center for Psychological Service, 1937. 

† Published by the World Book Company, Yonkers-on-Hudson, New York, 1930. 

Some evidence of validity of the test is indicated by the data 
of Table 32, which show that this prognosis test of teaching 
ability predicts success better than any other measure except 
mental ability. 

Table 32. Correlation of Various Tests (Given to Students 
Entering Teacher-Training Institutions in New York 
State) with a Comprehensive Achievement Test in Normal-
School Work Given at the End of the Year 

[Table body illegible in the source; the surviving row labels are “New York Rating” and “High-School Marks.”] 


No data on reliability are given, but the length of the test 
and the manner of its construction should guarantee satisfac¬ 
tory performance in this feature. A table indicating proba¬ 
bilities of success at each level of achievement is included in the 
manual of directions. 

3. The Knight, Bathurst, Ruch, Telford Professional Tests.* 
This test as first issued consisted of six parts: 

Part I. Professional Judgment (37 multiple-choice items) 

Part II. Theory and Practice of Teaching (67 true-false items) 

Part III. Reading Comprehension (27 multiple-choice items) 

Part IV. Social Information (26 multiple-choice items) 

Part V. School and Class Management (67 true-false items) 

Part VI. Professional Information (34 true-false items) 

* Knight, F. B., Bathurst, J. E., Ruch, G. M., and Telford, Fred. “Standardized Test 
for Elementary Teachers”; in Public Personnel Studies, p. 252. (October, 

The following are the first five items of Part V, on School and 
Class Management: 

1. Employment of a child at home by its parents is sufficient 
excuse for its absence from school. T F 

2. In drilling, it is better to give the exercises to the class in 
unison rather than individually. T F 

3. A good question should stimulate reflection on the part of 
the student. T F 

4. Not more than ten minutes of each period should ever be 
devoted to drill exercises. T F 

5. In general, an oral is better than a written test. T F 


The first five items of the test of Professional Information 
are as follows: 

Directions: Answer each of the 33 statements below by encircling the 
T or the F at the right as you did in tests III and V. 
Do not guess. 

1. All pupils are endowed with equal musical ability. T F 

2. In teaching a pupil language expression, his attention 
should be called to those aspects of his language in which 
errors lie. T F 

3. Terman is the author of the “Stanford Revision of the 
Binet-Simon Tests.” T F 

4. Stone is the author of “Reasoning Tests in Arithmetic.” T F 

5. A sensation is a conscious reaction that is occasioned by a 
definite stimulus. T F 


The revised test, under the title of Aptitude Tests for Elementary 
Teachers,* was organized under the following six captions: 

Test 1. General information 

Test 2. Educational psychology 

Test 3. Reading comprehension 

Test 4. Social information 

Test 5. School and class management 

Test 6. Professional information 

* Published by Bureau of Public Personnel Administration, Washington, D.C., 1925. 

Evaluation of professional tests. Although professional 
tests report reliabilities ranging from .75 to .90, they have not 
proved particularly useful in predicting teaching success, as 
viewed and rated by those who actually evaluate teaching. 
The cause may be an inferior selection of test items in relation to 
actual teaching activity, or it may be almost entirely evidence 
of the various and often conflicting activities which have passed 
as teaching success. In addition no test, however reliable, can 
exactly duplicate the varying conditions of the classroom; 
hence the variable context in which the teacher-personality 
functions will operate to reduce the prediction-values of tests. 

Finally, there is a constant tendency for tests to become 
unfitted to the needs of a community. Unlike tests of funda¬ 
mental skills in reading or arithmetic which are relatively 
stable and universal, philosophies and techniques of teaching 
are subject to wide variations. To the extent that tests hold 
teachers to basic values, they are good; to the extent that they 
fall behind real progress in the theory and practice of teaching, 
they are bad. The Los Angeles City Schools * utilize a card 
catalogue system of teacher-test items which are studied indi¬ 
vidually after each examination. In this way there is continu¬ 
ous revision, faulty items are eliminated, new items are added, 
and a file of validated items which represent the latest theory 
and practice is always available. 

Pupil results and the accomplishment ratio. When the view¬ 
point that pupil results were the most fundamental evidence of 
teaching success first gained acceptance, there were many 
attempts to translate this philosophical belief into a practical 
plan or technique of administration. Among these plans was 
the accomplishment ratio (A.R.) or accomplishment quotient 
(A.Q.) method. 

The total A.R. is the ratio between the E.Q. (educational 
quotient) and the I.Q. For any one subject it is the ratio be¬ 
tween the subject age and the mental age. 

* Toolln, W. T., and Humm, K. A. Analysis of Individual Items for the Improvement of 
Test Construction. Published by the Board of Education, Los Angeles City School District, 
Personnel Division, 1937. 

Subject ages are derived in the same way as mental ages. 
A reading age of 10 years, 5 months, for example, indicates 
reading ability equal to the average of those who are 10 years, 
5 months old chronologically. If mental age also happened to 
be 10 years, 5 months, the accomplishment ratio for reading 
would be 1.00 or 100, since decimals are dropped as in I.Q.’s. 
But an M.A. of 15 years, 7.5 months, and a reading age of 
10 years, 5 months, would yield an A.R. of only 66⅔ in reading. 

Other quotients and ages involved may be indicated as follows: 

1. I.Q. = M.A. / C.A. 

2. Total E.A. = (Sum of subject ages) / (Number of subject ages) 

3. E.Q. = E.A. / C.A. 

4. A.Q. or A.R. = (E.Q. or Subject age) / (I.Q. or Mental age) 
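
The quotients above can be checked with a few lines of arithmetic. The following is a minimal sketch, assuming all ages are converted to months and all quotients are expressed on the customary base of 100; the helper names and the second pupil's figures are invented for illustration and do not come from the text.

```python
def months(years, extra_months=0.0):
    """Convert an age given in years and months to months."""
    return 12 * years + extra_months


def quotient(age, reference_age):
    """Ratio of two ages (or two quotients) on a base of 100."""
    return 100.0 * age / reference_age


# Worked example from the text: reading age 10-5, mental age 15-7.5.
reading_ar = quotient(months(10, 5), months(15, 7.5))

# Hypothetical pupil (values invented for illustration only).
ca = months(10)                                   # chronological age
ma = months(11)                                   # mental age
subject_ages = [months(10, 6), months(11, 6), months(10)]

ea = sum(subject_ages) / len(subject_ages)        # total educational age
iq = quotient(ma, ca)
eq = quotient(ea, ca)
total_ar = quotient(eq, iq)                       # equals quotient(ea, ma)

print(round(reading_ar, 1), round(iq, 1), round(eq, 1), round(total_ar, 1))
```

Because C.A. cancels when E.Q. is divided by I.Q., the total A.R. is simply the educational age divided by the mental age, which is why the single-subject and total forms of the ratio agree in structure.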


In determining teaching success by this method the initial 
A.R. was subtracted from the final A.R. for the class as a 
whole. If the result was near zero, the teacher was considered 
average; if negative, below average; and if positive, above 
average. 

However, in both the total A.R. and A.R. for a single sub¬ 
ject, a measure of intelligence is the divisor. This technique 
therefore assumes that intelligence as represented by the M.A. 
or I.Q. is the sole determinant of achievement, which is con¬ 
trary to the known facts; evidence already presented shows 
that the correlation between intelligence and achievement 
averages about .45 to .50 and is only about 13 per cent better 
than chance. Pupils with similar M.A.'s and I.Q.’s are 
awarded all marks from failure to superior. Many factors 
other than M.A. or I.Q., such as effort, opportunity, work 
habits, and teaching procedures, are ignored when intelligence 
is wrongly assumed to be the sole determinant of achieve¬ 
ment. 
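
The figure of about 13 per cent agrees with the index of forecasting efficiency, E = 1 - sqrt(1 - r²), evaluated at r = .50. The passage does not name the formula it relies on, so treating this index as the source of the figure is an assumption; the sketch below only shows that the arithmetic is consistent.

```python
import math


def forecasting_efficiency(r):
    """Proportional reduction in the error of prediction, relative to
    guessing the mean, for a correlation coefficient r."""
    return 1.0 - math.sqrt(1.0 - r * r)


for r in (0.45, 0.50):
    print(f"r = {r:.2f}: {100 * forecasting_efficiency(r):.1f} per cent better than chance")
```

On this index a correlation must reach about .87 before prediction is even 50 per cent better than chance, which is the sense in which a coefficient of .45 to .50 leaves most of the variation in achievement unexplained.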

Workers soon discovered a negative correlation between 
A.R. and I.Q.; pupils with low I.Q.’s usually have high A.R.’s, 
whereas pupils with high I.Q.’s usually have low A.R.'s. Early 
writers believed that this negative correlation was due to the 
fact that duller pupils had to work harder to get along and that 
brighter pupils needed to put forth relatively less effort, hence 
the A.R. was hailed as a measure of the extent to which pupils 
worked up to capacity. However, observation does not support 
this claim that dull pupils work more faithfully or more 
diligently as a group than do bright pupils; in fact, there is 
evidence that the opposite is true. Furthermore pupils 
with identical A.R.'s exhibit a wide range of effort so far as can 
be observed. Finally, according to evidence, the assumption 
that similar effort, as such, is the crucial causal factor in learn¬ 
ing when combined with similar or even identical I.Q.’s is 
untrue. Actually, learners with identical I.Q.'s vary widely in 
achievement even though they appear to put forth the same 
effort. 

The real reason for the negative correlation between A.R. 
and I.Q. lies in the nature of these measures themselves. The 
A.R. is a ratio resulting from two quantities, both of which are 
subject to errors of measurement. The probable error of an 
I.Q. is generally about 5 I.Q. points, which means a possible 
variation in extreme cases of 20 points in either direction or a 
total range of 40 points. (A normal distribution ranges through 
eight probable errors.) The E.Q. is subject to similar possible 
variation. Thus variations of even one probable error, if they 
happen to be in opposite directions, change the A.R. consid¬ 
erably, as may be seen from the following example: 


1. E.Q. of 100 / I.Q. of 100 = A.R. of 100 

2. E.Q. of 95 / I.Q. of 105 = A.R. of 90 

3. E.Q. of 105 / I.Q. of 95 = A.R. of 110 
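
The example above can be extended to every combination of errors of one probable error in either term. This is a small sketch under the text's own figures (a probable error of 5 points, decimals dropped as with I.Q.'s); the function name is hypothetical.

```python
def accomplishment_ratio(eq, iq):
    """A.R. = 100 * E.Q. / I.Q., with the decimal part dropped."""
    return int(100 * eq / iq)


PROBABLE_ERROR = 5  # points, as stated in the text

true_eq = true_iq = 100
for eq_error in (-PROBABLE_ERROR, 0, PROBABLE_ERROR):
    for iq_error in (-PROBABLE_ERROR, 0, PROBABLE_ERROR):
        eq = true_eq + eq_error
        iq = true_iq + iq_error
        print(f"E.Q. {eq:3d}  I.Q. {iq:3d}  A.R. {accomplishment_ratio(eq, iq):3d}")
```

A pupil whose true A.R. is 100 can thus be reported anywhere from 90 to 110 without either measurement being off by more than one probable error.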


Errors in either term may vary from 1 to 15 or 20, and in 
either direction. Coy * found great difficulty in obtaining 
reliable A.Q. changes because of these and other disturbing 
influences. Crabbs † used the A.R. method for estimating 
teaching efficiency, and obtained a low correlation between 
A.Q. changes and estimates of teaching efficiency of 64 teach¬ 
ers. Such a result was inevitable because of the unreliability 
of both A.Q. changes and the criterion. Thus the A.R. tech¬ 
nique of utilizing pupil achievement as a measure of teaching 
success defeats its own purpose and other methods must be 
found to relate pupil growth and teaching success. 
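
The negative correlation between A.R. and I.Q. discussed above can be reproduced by simulation from measurement error alone. The sketch below is an illustration, not a reconstruction of any study cited here. It assumes that every pupil's true E.Q. equals his true I.Q., that true scores are normally distributed with mean 100 and standard deviation 15, and that the two measures carry independent errors with a probable error of 5 points (one probable error being 0.6745 standard deviations). All names are hypothetical.

```python
import math
import random

SD_PER_PROBABLE_ERROR = 1 / 0.6745  # one probable error = 0.6745 SD


def pearson_r(xs, ys):
    """Product-moment correlation of two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def simulate_ar_iq_correlation(n=5000, probable_error=5.0, seed=0):
    """Correlation between observed I.Q. and A.R. when the only thing
    separating E.Q. from I.Q. is independent measurement error."""
    rng = random.Random(seed)
    sd_error = probable_error * SD_PER_PROBABLE_ERROR
    iqs, ars = [], []
    for _ in range(n):
        true_level = rng.gauss(100, 15)          # true E.Q. == true I.Q.
        iq = true_level + rng.gauss(0, sd_error)
        eq = true_level + rng.gauss(0, sd_error)
        iqs.append(iq)
        ars.append(100 * eq / iq)
    return pearson_r(iqs, ars)


print(round(simulate_ar_iq_correlation(), 2))
```

Under these assumptions no pupil works above or below capacity, yet the correlation comes out negative, because any error that raises the observed I.Q. lowers the ratio of which it is the divisor.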

Diagnosis vs. appraisal of teaching. The techniques so far 
considered for evaluating teaching and teachers include score 
cards, rating blanks, check lists, intelligence and achievement 
tests, professional teaching tests, and the A.R. technique. 
These have in the past been used primarily as appraisal tech¬ 
niques in the effort to determine teaching efficiency. In gen¬ 
eral they are too far removed from teacher problems and 
activities to be useful for the diagnosis of specific teaching 
difficulties and the improvement of teaching and learning. 
Many of them have no value even for appraisal purposes because 
of their unreliability. Even when the reliability of a 
given instrument, such as a graphic rating scale 
or a test, appears satisfactory, predictions based on its scores 
have little relationship to the ratings given later by those who 
judge teaching. 

* Coy, G. L. “Study of Various Factors Which Influence the Use of the Accomplishment 
Quotient as a Measure of Teaching Efficiency”; in Journal of Educational Research, vol. 21, 
no. 1. (January, 1930.) 

† Crabbs, L. M. Measuring Efficiency in Supervision and Teaching. Teachers College 
Contributions to Education, no. 175. 1925. 

Two additional possibilities have already been suggested, 
that is, the use of teacher-activity units, and an approach to 
the use of pupil results different in nature from the A.R. technique. 
The basic principle of the teacher-activity-unit method 
is a common understanding between teacher and rating officer 
of the specific important activities in which the teacher should 
engage; rating then consists of checking teacher activities 
against these standard activities. The basic principle of utiliz¬ 
ing pupil results to determine the adequacy of teaching is to 
measure the factors and combinations of factors related to 
learning which differentiate groups and classes, and to deter¬ 
mine experimentally the varying nature and magnitude of 
achievement which may reasonably be expected from these 
different groups. These are not mutually exclusive or anti¬ 
thetical procedures; the first is primarily diagnostic, and the 
second, primarily an appraisal technique. If they are valid, 
however, they should lead to the same conclusions. 

Teacher-activity units. The desirability of developing and 
using effective pupil activities is widely accepted. It seems 
logical, therefore, to attempt to develop similarly effective 
teacher activities, both in the interests of learning and for more 
adequate evaluation of teaching. This immediately raises the 
question of freedom, initiative, and originality of teachers, a 
question dear to the hearts of those who defend and inspire 
teachers on public institute platforms. Actually, members of 
this group often know very little about classroom activities, 
and would fail, or have failed, as directors of learning. 

In one sense freedom is an illusion. The more definite the 
course of study and the selection of materials, the less freedom 
remains; the more definite the procedures, the less freedom and 
opportunity for initiative and originality. In another sense 
this is the most desirable kind of freedom, since it relieves the 
teacher from the necessity of attempting to improvise as she 
goes along and frees her to direct her energies along worthwhile 
channels. Rails limit the freedom of the train, but they direct 
activity to the right destination. In the same way, properly 
developed activity units should free teachers for more adequate 
direction of learning. Recent educational history is surfeited 
with examples of originality. Too often they have been mere 
departures from a known and dependable psychology of learn¬ 
ing to something strange, unfamiliar, and exciting, mere leaps 
from the known to the unknown, which reflected the confusion 
and mental attitudes of the profession rather than the needs of 
children and youth. It is time that we had less originality and 
more professional adequacy. 

Teacher-activity units are organizations of the best known 
procedures for directing learning activities. We demand that 
the surgeon who removes a brain tumor use the best-known 
method; we would avoid a surgeon who insisted on being 
original on the ground of personal rights. When research and 
experience have revealed a best-known method of attaining a 
certain educational end, this method should be adopted and 
used. 

Where no best procedures are known, teachers, supervisors, 
administrators, and authorities should pool their knowledges 
and convictions and agree upon the details of activity units. 
An essential condition is that these organizations of procedures 
should be accurately known and understood by both teachers 
and rating officers to improve the possibility of valid evalua¬ 
tion. And they should be committed to writing and remain 
standard until changed. Schools regularly keep written 
records of the most diverse, voluminous, and often trivial matters. 
Yet some of the most important factors in learning have 
been left largely to chance. 

Teacher-activity units are not mere theoretical or philo¬ 
sophical projections of the imagination. Progress has already 
been made in the skills and social studies. To clarify the concept 
the following illustration is offered: 

TEACHER WEEKLY ACTIVITY UNIT IN SPELLING 

Monday 

1. Give a pre-test on words to be studied. 

2. Direct the scoring of papers. 

3. Have each pupil make a personal list of words missed. 

4. Make a list of words missed by more than fifty per cent of the 
class and teach them by this method: 

a. Develop meanings of each word as studied. 

b. See that pupils examine each carefully and obtain good vis¬ 
ual images. 

c. Pronounce word and have pupils pronounce it while looking 
at it. 

d. Have pupils pronounce and write word while looking at it. 

e. Have pupils pronounce word and write it from memory. 

f. Have pupils compare word with correct copy, and continue 
(3 to 5 times) writing from memory until they have mastered it. 

Tuesday 

1. Aid pupils in studying personal word-lists. 

2. Direct review study of pupils who made no errors on Monday 
list. 

3. Study words missed on final Friday tests. 

Wednesday 

1. Give a test on words missed on Monday and any other written 
work during the week. 

2. Direct scoring of papers. 

3. Have each pupil make a list of words missed. 

4. Teach words missed according to above plan (Monday, No. 4). 

Thursday 

1. Give a test on review words. (Words missed previously on 
Friday tests.) 

2. Direct scoring of papers. 

3. Direct revision of review study lists. 

4. Direct study of words missed on Wednesday and Thursday 
lists. 

Friday 

1. Give final test on new words for week. 

2. Direct scoring of papers. 

3. Record pupils’ scores for week. 

4. Have pupils add words missed to review lists. 

5. Direct study of words most difficult for class as a whole. 
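
The record-keeping in Monday's steps 3 and 4 can be sketched as follows. This is an illustration only: the data layout (a mapping from each pupil to the words attempted and whether each was spelled correctly), the function names, and the sample pupils and words are all assumptions, not part of the unit itself.

```python
from collections import Counter


def personal_lists(results):
    """Step 3: each pupil's own list of words missed on the pre-test."""
    return {
        pupil: sorted(word for word, correct in answers.items() if not correct)
        for pupil, answers in results.items()
    }


def class_teaching_list(results):
    """Step 4: words missed by more than fifty per cent of the class."""
    class_size = len(results)
    misses = Counter(
        word
        for answers in results.values()
        for word, correct in answers.items()
        if not correct
    )
    return sorted(word for word, n in misses.items() if n > class_size / 2)


# Hypothetical pre-test results (True = spelled correctly).
pretest = {
    "Ann": {"separate": False, "believe": True, "rhythm": False},
    "Ben": {"separate": False, "believe": True, "rhythm": True},
    "Cal": {"separate": False, "believe": False, "rhythm": True},
    "Dot": {"separate": True, "believe": True, "rhythm": False},
}

print(personal_lists(pretest)["Ann"])
print(class_teaching_list(pretest))
```

The comparison is strict, so a word missed by exactly half the class stays on the personal lists and is not taught to the whole group, which follows the wording “more than fifty per cent.”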

Parts of the above unit are the result of extended research of 
many workers. Other features are contributed and upheld by 
many teachers and administrators who have taught spelling 
successfully. It constitutes a unit or frame of reference within 
which learning may be directed, but it still allows teachers a 
good deal of freedom in directing certain activities. As evi¬ 
dence makes refinement of procedures possible, the unit may 
become more definite. If the principal desires to rate the 
teacher he may check her direction of activities against the 
activities of the unit. If a standardized spelling test or the 
weekly reports of percentages of correct spelling indicate un¬ 
satisfactory achievement for the class, he may use this activity 
unit as the point of departure in locating teaching difficulties. 

Such a unit should be followed in a school or school system 
until changes are made. Research done by teachers or re¬ 
ported in professional literature, information obtained from 
lectures and university classes, and the like should be gathered 
constantly, and changes made in accordance with the best 
evidence available. Teachers and rating officers should have 
copies of the units in force, and evaluations of teaching success 
and diagnosis of teaching difficulties may be made in terms of 
this common understanding. Similar units have been devel¬ 
oped for writing, music, and reading. Units recently developed 
in social studies, industrial arts, and science * may easily be 
used as the basis for developing teacher-activity units. Well-
considered direction of carefully developed procedures based 
on the needs and capacities of learners may be substituted for 
the impulses, originality, and improvisations of the moment. 

* See Tiegs, E. W. The Management of Learning in the Elementary Schools. New York, 
Longmans, Green, and Company, 1937. 

The development of such units will force a clarification of 
objectives, materials, and procedures in fields in which teach¬ 
ing cannot now be evaluated because no one knows just what 
the teachers are trying to do or how they are succeeding. Such 
units require answers to the questions, What do you do? and 
Why do you do it? Whether or not the correct answers are 
at once supplied, they make possible the evaluation of teaching 
in terms of what teachers are consciously trying to do. 

Diagnosing teacher difficulties. A complete diagnosis of 
unsatisfactory progress of a class sometimes involves a study of 
the teacher herself. Among the factors which should receive 
attention are the following: 

1. General health. Even though teachers are Class A insur¬ 
ance risks, the conditions of the classroom constitute a heavy 
strain on general health. A complete physical examination 
should be required or supplied periodically and made available 
for inspection of responsible officers. 

2. Vision and hearing. Eye strain and defective hearing are 
two potent causes of difficulty. Teachers are often unaware 
of them. 

3. Mental health. A physician * recently reported that 1500 
of New York’s 37,000 teachers were psychopathic or definitely 
insane. Teachers who constantly quarrel with their students, 
colleagues, supervisors, and administrators, or who exhibit ex¬ 
treme withdrawal behavior should receive the attention of qual¬ 
ified physicians. In general, the same methods which teachers 
use in the diagnosis of adjustment difficulties of students 
(see Chapters Five, Seven, and Nine) may be used by adminis¬ 
trative officers in the analysis of teacher-adjustment difficulties. 

* New York Times, March 27, 1934, p. 3. 

4. Teaching load. The crude methods used in assigning 
work result in unreasonable loads for some teachers. Relative 
weights of different types of work combinations have been 
studied,* and should be considered in assignments. 

5. Teaching subjects or fields. Some alleged failure of teach¬ 
ers is due to administrative problems; beginning teachers in 
cities of less than 5000 are regularly required to teach in fields 
in which they have had no preparation. While current meth¬ 
ods of determining teaching success by estimate and opinion 
are so unreliable as to reveal no high correlation with any pre¬ 
dictive instrument, the question of proper assignment should 
be raised before a teacher is adjudged a failure. 

6. Financial difficulties. Financial worries are a potent 
cause of worry and misunderstanding. Teachers frequently 
carry heavy responsibilities. A knowledge of the causes of 
difficulties enables a school official to render fairer judgments. 

7. Improvement programs. No member of a developing and 
changing profession can hope to give adequate service without 
a program of personal improvement, consisting of reading, 
travel, university classes, and the like. Some teachers have not 
done any serious study in twenty or thirty years. Records of 
such programs should be available to supervisors and principals. 

8. Aid from supervisors. Some schools and school systems 
have no schedules for supervisory assistance, with the result 
that supervisors spend their time with a limited number of 
teachers in a limited number of buildings. Records of visits 
of supervisors, with their findings and suggestions, should be 
available. 

Suggested program for evaluation of teaching. At present 
it is generally impossible either to diagnose teaching difficulties 
or to appraise total teaching success or adequacy. Too many 
erroneous convictions are held by rating officers, and too many 
useless and unreliable plans of evaluation have become entrenched
in the thinking and printed forms of school officials
and instructors in teacher-training institutions, and both are
difficult to change. Many factors appear to be related to
teaching and learning; and evidence of them gathered through
diagnosis and appraisal may be used for many different purposes.
For these reasons a program of evaluation, rather than
a method of rating, should be developed.

* Woody, Clifford, and Bergman, W. G. “The Measurement and Equalization of Teaching
Load in the High School”; in North Central Association Quarterly, I, pp. 339-58, 1926.
See also Tritt, W. W., and Keyes, M. M. “Estimating Teaching Loads by Means of Subject
Coefficients”; in Nation's Schools, no. 5, 1930.

The physician has no way of combining the results of various 
tests and observations into a final score or mark of eighty-six 
or ninety-two, representing the health of the patient. Such a 
score is neither logical nor useful, and may be positively mis¬ 
leading and dangerous. Yet school officials persist in the 
search for a device which will make it possible to express, in 
one quantity, relative magnitudes of the complex of factors 
related to successful teaching and learning. Such a single 
score conceals rather than reveals important factors with which 
supervisors and administrators should be concerned. Like the 
physician, supervisors and administrators should gather as much 
accurate, related evidence as possible, and then make judg¬ 
ments and decisions on an accurate picture of the total situation. 

The following suggested program will go beyond the present 
theories and practices of many school administrators, yet it 
represents a reasonable and attainable program in any modern 
school system. 

The program consists of five parts, as follows: 

1. A pupil-testing program, revealing results of teaching in terms 
of pupil growth and development 

2. Teacher-activity units for the appraisal of teaching and the 
diagnosis of teaching difficulties 

3. Graphic rating scale for sampling teacher characteristics 
deemed important in teaching and learning 

4. Professional tests for teachers, revealing changes in attitudes 
and ideals, and new knowledges and understandings related to 
“keeping up-to-date” 

5. Professional diagnostic conferences

Where pupils are making normal progress as revealed by
valid tests, little if anything more is needed in the way of teach¬ 
ing evaluation. If the test results are unsatisfactory, the first 
assumption should not be that they are the results of poorly 
directed learning. The ability-level of the class, the definite¬ 
ness of objectives, the quality and quantity of materials and 
equipment, previous achievement and working habits of the 
class, and other factors may account for these results. 

The method of relating the average I.Q. of groups or classes
and achievement gains has already been presented.* For
subjects lasting but one year or one semester, similar experimental
determinations may be made on the basis of gains between pre¬ 
tests and final tests. The fact that students are at, below, or 
above a norm proves nothing in itself. Only in the light of 
what particular groups should accomplish are judgments 
regarding the standards reached valid. Both actual achieve¬ 
ment and reasonable expectation must, however, be known, 
and may be determined through measurement. Certain other 
factors, such as course of study materials, supplies and equip¬ 
ment, hours of work, and the like, are standard, and may gen¬ 
erally be ruled out as factors causing differentiation in results. 
However, other factors, such as the per cent of retardation, 
extreme absence, neighborhood and home environment, and 
the attitudes and assistance of other teachers as well as of 
supervisory and administrative officers may be causal factors 
in results. At present we can only estimate their influence. 
The extent of their influences should be determined experi¬ 
mentally as soon and as accurately as possible. 

* See Chapter Two.

Many teachers and administrators do not yet understand the
nature and function of tests and measurements. Again, our
capacity for conceiving and imagining outcomes outruns our
ability to prove their existence, and, in fact, often appears to
far outrun all evidence of reality! Furthermore, if procedures
are validated and organized on the basis of the results produced,
the results of teaching can be anticipated to a certain
extent by the use of properly validated procedure. For these 
reasons, properly developed and validated teacher-activity 
units may be utilized as alternatives where no adequate testing 
program is operative. Such units may be used as supplements
to good testing programs, furnishing a different type of evidence,
and may be used also as independent diagnostic and
appraisal devices. Undoubtedly, there are many occasions 
when administrative officers question the ability of teachers 
on the basis of pupil results. A check-up against the teacher-activity
unit may reveal that the direction of learning conforms
to the procedures adopted for use. Pupil growth and development,
however, are primary; hence the administrator (and
teacher) must look further to discover the causal factors in
unsatisfactory progress.

The graphic rating scale should be used to rate teachers 
rather than teaching. Teaching can be described and evalu¬ 
ated more accurately and usefully in terms of teaching-learning 
results and teaching-learning procedures. The qualities which 
appear on such graphic rating scales should be described as 
definitely as possible in terms of elements whose presence or 
absence can be definitely identified. The credits or scores 
assigned for varying manifestations of each quality should be 
based upon experimental studies of the degree to which teach¬ 
ers in general manifest such qualities, and not on the arbitrary 
judgment or judgments of rating officers. The graphic rating 
scale may be used as a supplementary device to measurement 
and activity-unit evaluation, but it has little value when used 
without evidence of pupil outcomes or ability to utilize pro¬ 
cedures. However, in the case of significant difficulty, a single 
item of the scale may sometimes provide an important clue. 
It is helpful in providing additional information to those who 
select or promote teachers, but little is known about the relationship
between varying amounts of certain qualities and
pupil progress. It is possible that when sufficiently compre¬ 
hensive graphic scales based upon experimental evidence are 
produced, the total scores on such scales may predict ability 
to direct learning, as measured by valid tests. 

Professional tests or examinations for teachers have been 
used primarily in training, selection, and promotion of teach¬ 
ers, but also have value for diagnosis. When teachers fail per¬ 
sistently in some area of responsibility, similar situations may 
be incorporated in diagnostic tests in hypothetical form, and 
teachers requested to tell what they would do and why. Other 
items of such tests may be framed to reveal the teacher’s 
knowledge of recent contributions to the theory and practice 
of teaching; problems related to the work which is being in¬ 
vestigated; and changes in attitudes toward work and ideals 
that have come as a result of experience, study, travel, and the 
like. Care should be observed in using standardized profes¬ 
sional tests for teachers. Items should be examined by quali¬ 
fied workers to determine whether or not they represent the 
best current knowledges and beliefs in theory and practice. 
The scoring key can sometimes be modified to eliminate the 
influence of objectionable items. While the normative values
of the results may be destroyed, a study of the errors of teach¬ 
ers having classroom difficulties may nevertheless be enlighten¬ 
ing. Furthermore, significant score differences may be used 
further to differentiate between teachers. 

No plan, either of appraisal or diagnosis, can be expected to 
succeed if it does not operate with full knowledge and co-oper¬ 
ation of teachers. For this reason the professional conference 
should be made a part of the diagnosis of teaching difficulties. 
There are still so many factors which necessitate a “conclusion 
of the witness,” instead of the recording of objective evidence, 
that the interpretation of the teacher should always be consid¬ 
ered before final conclusions are reached. The major purpose
of such a conference should be the identification of causes of
teaching difficulties. It should proceed on the basis of pupil 
results, activity-unit appraisal records, graphic rating, and pro¬ 
fessional tests, together with other related information. When 
difficulties become serious, and pupils, other teachers, parents, 
or members of the board of education become involved, care 
must be observed to obtain a fair sampling of opinion. Known 
friends as well as known enemies, if any, should be consulted. 
Since the interests of both teacher and administrative officer 
are involved, a professional diagnostic conference offers an 
opportunity for co-operation on the basis of objective evidence, 
rather than for strife and argument concerning gossip and issues 
which cannot be objectified. 

When the principal or supervisor suggests modifications in 
materials and procedures they should be based on something 
more than personal opinion. Where such modifications have 
been followed faithfully and results are still unsatisfactory, the 
search for causal factors other than procedures should be begun. 
Only when other factors have been ruled out and teachers are 
unable or unwilling to follow the modification recommended 
by superior officers, in the face of unsatisfactory pupil results, 
may the charge of inefficiency reasonably be made.

QUESTIONS AND EXERCISES 

1. Outline the major objections of teachers to current rating plans. 

2. Describe and evaluate the most used types of rating techniques. 

3. Write your own definition of teaching efficiency. 

4. Explain your own concept of the difference, if any, between the 
appraisal of teaching, and the diagnosis of teaching difficulties. 

5. Discuss the values of intelligence, achievement, and professional 
tests in appraisal and diagnosis. 

6. Write your own evaluation of the A.R. technique and explain care¬ 
fully the reasons for your conclusions. 

7. What is meant by a teacher-activity unit? How does it differ
from previous practice in rating teaching procedures?

8. Explain and evaluate the factors presented in diagnosing teacher
difficulties. What modifications or additions would you make? 

9. Outline and evaluate the suggested program for the evaluation 
and improvement of teaching. 

10. Write your own program for using measurement for the
improvement of teaching.

APPRAISING THE EDUCATIONAL
PROGRAM 




CHAPTER FOURTEEN


NATURE OF VALID MEASUREMENT: 
ELEMENTARY CONCEPTS AND 
PROCESSES 

In considering the validity of measurement instruments, 
attention should be called to the fact that no perfect test 
exists in education any more than in medicine, or engineering, 
or in any other field. There is always a margin of error, large
or small. Progress is made through the use and improvement 
of less valid tests, as well as through the development of new 
ones. The value of a measurement instrument should there¬ 
fore be judged by the extent to which it represents improve¬ 
ment over informal opinion and estimate and previous tests, 
as well as by its limitations. 

Objections to measurement. Certain objections to meas¬ 
urement in education are superficial and untenable, and show 
lack of knowledge of its nature or functions. Such criticisms 
can be removed only through education. The following are 
examples of these criticisms: 

1. Modern language teacher: Don’t need tests; I use the conversa¬ 
tional method. 

2. Attendance teacher: I.Q. tests are dangerous; besides, they rarely 
detect skippers. 

3. High-school principal: Tests do not show will or determination of 
students. 

4. College teacher: I can tell what mark to give when I see a student's
work. 

5. Tenth-grade teacher: Tests do not detect potential criminals. 

Some objectors question the foundation of the whole theory 
and practice of measurement. They utilize a pseudo-scientific
but erroneous basis of argument which confuses objective procedures
and alleged outcomes. A favorite method of attacking
measurement is to assert (not prove) that it is based on a
psychology which assumes: 

1. That all children are born with the same qualities and limitations,
and that any child can be developed into virtually any type of
adult.

2. That the argument for measurement runs as follows:

a. Study stimuli which produce certain responses.

b. Organize stimuli to produce certain responses.

c. Control responses (learning) through stimuli.

d. That the whole process is mechanistic.

3. That ideals, attitudes, instincts, loyalties, the “unconscious,”
the hereditary, the congenital, and other influences are ruled out.

The nature of measurement. Measurement, as such, is not 
interested in behaviorism or any other psychology or its vary¬ 
ing interpretations. It makes no claims concerning the edu¬ 
cability of individuals, but aims to reveal as accurately as 
possible evidences of actual traits, abilities, or achievement. 
Measurement is not concerned with the manner, as such, in
which learning occurs, but with objectives or the desirable 
changes to be attained through educational activity, and with 
evidences of such attainment. Measurement rules out no 
objectives or influences; it merely requires its critics to be 
definite and specific regarding objectives, including ideals, 
attitudes, and loyalties and other vague alleged outcomes. 

The case for measurement may be summarized as follows: 

1. Schools are maintained to make desirable changes in individuals.

2. Objectives must be specified in order to determine their desirability
and attainability, in order to select materials, and to direct
educational activity wisely.

3. Not all desirable objectives are known at any one time; children
and youths may experience desirable changes not previously singled
out for attention. When proof of desirability is satisfactory,
these changes may be added to regular objectives, and similar
changes made in tests for outcomes. The most important but
definite and specific objectives will receive most attention.

4. Measurement makes no assumptions regarding the nature of the
learner; it reveals important aspects of that nature. 

5. Measurement makes no assumptions regarding the efficacy of
materials or procedures nor does it control or dictate them; it 
reveals evidences of outcomes from which such efficacy may be 
inferred. Incidentally, those who attack measurement through 
behaviorism defeat their own purpose; if responses (learning) 
cannot be influenced through stimuli (materials and procedures) 
no real reason for continuing schools exists. 

6. Measurement as such is not concerned with initiative or originality;
it merely reveals, indirectly, evidence of their existence or
lack of it. Incidentally, it would appear that adequacy in at¬ 
taining important objectives through methods of known merit is 
as important or more important than through departures from 
such procedures; a civilization composed entirely of individualists 
would soon disintegrate. 

7. Measurement makes no assumptions about the nature of outcomes;
such assumptions are usually abundantly made by those
who claim to produce them. There is little argument about the 
possibility of measuring facts, information, skills, and certain 
understandings, even though the value of facts and information 
as outcomes is frequently questioned. However, an examination 
of the lectures, assignments, discussions, and examinations of 
those who pay such high tribute to ideals, attitudes, loyalties, 
and “heart power” in their contributions to professional literature 
reveals that their efforts with students are confined largely to the 
mastery of facts and information. 

Measurement is not concerned with the arguments which 
confuse wishful thinking with evidence, or which choose to 
exalt the products of imagination and ignore concrete evidences
of outcomes. Measurement does not attempt to translate
a concept like loyalty or love into mathematical quantities.
It merely asks that the individual who claims to develop
such outcomes be specific regarding their nature, and, if he
cannot be literate and understandable regarding the specific
nature of such outcomes, that he at least name the specific
aspects of behavior that may reasonably be accepted as evidences
of such outcomes. If one individual manifests twenty-five
evidences of loyalty, while another manifests only ten such
evidences, under carefully controlled conditions, measurement 
holds that the quantitative concept is meaningful and useful. 

Growth of measurement. Man left the caves when he began
to count. The progress of civilization has been conditioned 
and often determined by growth in ability to do quantitative 
thinking. Beginning with no units of linear measure, man 
has developed and refined his concepts and instruments 
through the foot, inch, and micrometer-caliper which measures 
in thousandths of an inch; and even the latter measurement 
seems crude when compared to the discriminations possible 
through deflecting a beam of light. Light, which for years 
defied quantitative treatment, can now be measured accu¬ 
rately in foot-candles, and noise is measured in decibels. 
Ability to identify, measure, and control the phenomena in a 
given field is one good test of progress. Education has great 
possibilities, some of which are being realized. Progress will 
depend largely upon the extent to which teachers master and 
use more scientific tools. 

Functions of measurement. Some of the major functions of 
measurement have been discussed incidentally in a number of 
other connections; they will be illustrated more adequately in 
Chapter Eighteen. However, it will be helpful to review them
preliminary to the study of the characteristics and validation 
of measurement instruments. Some of the major functions 
of measurement are: 

1. To reveal intelligence, ability, or mental-maturity levels of 
classes, schools, and school systems. 

2. To reveal the achievement levels of classes, schools, and school
systems.

3. To provide the evidence necessary as the point of departure for
diagnosis of learning difficulties and redirection of learning.

4. To provide data for use in various types of ability grouping and
classification.

5. To provide a partial basis for marking and promotion.

6. To provide evidence on which to base decisions regarding de¬ 
sirability of modifications of, or additions to, objectives. 

7. To provide data which may be used in investigating the adequacy 
of materials and equipment.

8. To provide data which may be used in studying the effectiveness 
of procedures. 

Major types of tests. Many different tests have been dis¬ 
cussed and used for illustrative purposes; a brief outline show¬ 
ing their nature and functions was presented in Chapter One. 
The student will find it profitable to review this outline at this 
point. 

Many other terms are used in describing tests. A rate test 
is one usually composed of items more or less equal in difficulty 
and has as its objective to measure the rate with which the 
tasks of the test can be successfully performed. A power test 
is one in which the items are arranged in order of difficulty; 
when items differ from each other in approximately equal incre¬ 
ments, the test is said to be scaled. The terms verbal and 
non-verbal are used to describe the extent to which language 
ability is involved in the correct responses to tests. A pre-test 
is a test of the objectives of a unit topic or subject, given before 
educational activities begin; the same test may be used later 
for diagnostic purposes. 

Duplicate forms. A standardized test usually has two or 
more forms; each consists of a group of test items which 
measure the same things and produce approximately the same 
results. Such duplicate forms are useful for testing sections 
of classes meeting at different hours or on different days. They 
may be alternated in the distribution of tests to classes, thus 
minimizing the temptation to cheat, as well as at the end of 
succeeding semesters or quarters. In addition, a duplicate
form is sometimes used in studying the reliability of the test.
The process of determining reliability will be described
later. 

Time allowances. Standardized tests and parts of tests fre¬ 
quently have definite time limits. The object is to give every 
subject tested an opportunity to react, but few the oppor¬ 
tunity to complete all of the tasks. If all, or none, of the items 
have been answered, neither relative nor complete measure¬ 
ment of performance has been obtained. Where tasks are 
few and time limits are correspondingly short, careful timing
is necessary or the results of different tests will not be
comparable. Time allowances are established in order that 
tests may be repeated as far as possible under identical con¬ 
ditions. 

Giving and scoring tests. Group tests of mental maturity, 
personality, or achievement may be given (and scored) by 
teachers. In general, valid objective tests may be scored by 
clerks, but teachers should study the individual test items of
pupils experiencing difficulty. Scoring keys and definite instructions
for scoring are provided for good tests. Individual
tests of intelligence should be given only by psychologists or
workers with special training, since the determination of suc¬ 
cess on individual items involves a measure of estimate and 
judgment. 

Tabulating test results. If only the mean is desired on a 
particular test, the familiar arithmetic average may be com¬ 
puted by adding the scores and dividing by the total number 
of scores. If the median only is desired, papers may be ar¬ 
ranged in order of size and the score of the middle paper used. 
When there is no middle paper (number of scores is even), the 
value between the scores of the two mid-most papers may be 
used as the median. 
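The two informal procedures just described can be sketched in a few lines of Python. This is only an illustration: the function names and the six scores are hypothetical and do not come from the text.

```python
def arithmetic_mean(scores):
    """Add the scores and divide by the total number of scores."""
    return sum(scores) / len(scores)


def mid_score(scores):
    """Arrange the papers in order of size and take the middle one.

    When the number of scores is even there is no middle paper, so the
    value halfway between the two mid-most scores is used instead.
    """
    ordered = sorted(scores)
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:
        return ordered[middle]
    return (ordered[middle - 1] + ordered[middle]) / 2


# Hypothetical scores, chosen only to show the even-count case.
sample = [106, 83, 114, 97, 91, 110]
print(arithmetic_mean(sample))
print(mid_score(sample))  # 101.5, halfway between 97 and 106
```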

However, when the number of test scores is large, or when 
other information is desired, the scores may be tabulated. For 
example, assume that a group of fifty-five students have taken
an intelligence test, and that the I.Q.’s obtained are as follows:

114, 107, 114, 114, 109, 111, 106, 97, 112, 139, 107, 109, 97, 111,
132, 99, 118, 107, 96, 106, 123, 115, 116, 120, 115, 86, 112, 109,
115, 97, 100, 128, 126, 96, 115, 110, 111, 110, 111, 95, 102,
106, 100, 112, 83, 114, 110, 85, 110, 91, 83, 85, 128, and 91.
If they are tabulated they will appear as the first two columns
of Table 33.


Table 33. Illustration of Frequency Table and Computation
of the Median, Mean, and Standard Deviation

Scores       f        d        fd        fd²
  (1)       (2)      (3)      (4)        (5)

135-139      1        5         5        25
130-134      1        4         4        16
125-129      3        3         9        27
120-124      2        2         4         8
115-119      6        1         6         6
110-114     15        0         0         0
105-109      9       −1        −9         9
100-104      4       −2        −8        16
 95- 99      7       −3       −21        63
 90- 94      2       −4        −8        32
 85- 89      3       −5       −15        75
 80- 84      2       −6       −12        72

Totals      55 (N)            −45 (Σfd)  349 (Σfd²)

\[ \text{Mean} = \text{assumed mean} + \frac{\Sigma fd}{N} \times \text{size of class interval} = 112.5 + \frac{-45}{55} \times 5 = 108.4 \]

\[ \text{Median} = \text{bottom of interval containing median} + \frac{\text{No. cases needed}}{\text{No. cases in interval}} \times \text{size of class interval} = 110 + \frac{\tfrac{1}{2}}{15} \times 5 = 110.17 \]

\[ \text{Standard Deviation} = \sqrt{\frac{\Sigma fd^2}{N} - \left(\frac{\Sigma fd}{N}\right)^2} \times \text{interval} = \sqrt{\frac{349}{55} - \left(\frac{-45}{55}\right)^2} \times 5 = 11.9 \]

The rules for the tabulation of scores are as follows:

1. Find the range of scores or difference between the highest and 
lowest score. (139—83 = 56, the range of scores.) 

2. Divide the range by a number such that the quotient will be not 
less than ten or more than twenty, data permitting. (Fifty-six 
divided by 5 = 11+, which satisfies the condition, and 5 may be
used as the size of the class intervals.) 

3. Begin each class interval on a multiple of the size of the class 
interval. (Thus 80, 85, 90, etc., are used as the beginnings of the
intervals.) 

4. Tabulate each score in the order in which it occurs. Checks or 
lines may be entered after each interval as the tabulation proceeds 
and totals, only, entered in the final frequency table. 
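The four rules above can be sketched as a short Python routine. This is a minimal illustration, assuming whole-number scores; the function names and the six sample scores are hypothetical and are not taken from the text.

```python
from collections import Counter


def number_of_intervals(scores, width):
    """Rules 1 and 2: range of scores divided by a trial interval size."""
    score_range = max(scores) - min(scores)
    return score_range / width


def tabulate(scores, width):
    """Rules 3 and 4: return (lower limit, upper limit, frequency) rows.

    Each interval begins on a multiple of the interval size, and the
    rows are listed with the highest interval first, as in Table 33.
    """
    tally = Counter((s // width) * width for s in scores)
    lowest = (min(scores) // width) * width
    highest = (max(scores) // width) * width
    return [(low, low + width - 1, tally.get(low, 0))
            for low in range(highest, lowest - 1, -width)]


# Rule 2 applied to the range quoted in the text (139 - 83 = 56).
print(number_of_intervals([139, 83], 5))  # 11.2, between ten and twenty

# Hypothetical scores, used only to show the shape of the result.
for row in tabulate([83, 85, 91, 97, 99, 100], 5):
    print(row)
```

An empty interval is reported with a frequency of zero rather than dropped, so that the deviation column of a later computation stays evenly spaced.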

Computing Md and Q. Because the mean, standard deviation,
and normal distribution are so closely related, they will be
considered together. The median and quartile deviation are 
two of the more informal measures of central tendency and 
variability, respectively. They are frequently met in profes¬ 
sional literature, and will be briefly defined and illustrated. 

The median has already been defined elsewhere as the mid¬ 
score. Its computation may be illustrated by referring to 
Table 33. The steps in the computation are as follows: 

1. Divide the total number of frequencies by 2. (55 ÷ 2 = 27½.)

2. Add frequencies from the bottom of the distribution (if the smallest
scores are at the bottom) until the interval which contains
the mid-score is found. (The total of the frequencies including
score 109 is 27; one-half of one frequency is therefore required
from the 15 in the 110-114 class interval, which will contain the
median.)

3. Using as the numerator the number of frequencies required from
the interval, and as the denominator the total number of frequencies
in the interval, multiply this fraction by the size of the
class interval. (½ ÷ 15 × 5 = .17.)

4. Add to the lower limit of the interval containing the median the
quantity obtained in operation 3. (110 + .17 = 110.17, the
median.)

The quartile deviation, or Q, is defined as one-half the distance
between the first and third quartiles and is computed in
a manner similar to the median. Instead of finding the midpoint
of the distribution by dividing N by 2, the first quartile
(Q1) is found by dividing N by 4, and the third quartile (Q3) by
taking ¾ of N. These frequencies are then utilized to find
the corresponding score values; Q1 is subtracted from Q3, and
the difference is divided by 2. The formula is as follows:

\[ Q = \frac{Q_3 - Q_1}{2} \]


Referring to Table 33, the computation is as follows: 

1. Q3 is at the ¾ of 55 or the 41¼ frequency and in the
110-114 interval. Therefore,

\[ Q_3 = 110 + \frac{14\tfrac{1}{4}}{15} \times 5 = 114.75 \]

2. Q1 is at the ¼ × 55 or 13¾ frequency and in the 95-99
interval. Therefore,

\[ Q_1 = 95 + \frac{6\tfrac{3}{4}}{7} \times 5 = 99.82 \]

3. \[ Q = \frac{114.75 - 99.82}{2} = 7.46 \]
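The median and quartile computations follow one pattern: count up from the bottom of the distribution to the required frequency, then interpolate within the interval reached. The sketch below applies that pattern to the frequencies of Table 33. The function name `score_at` is hypothetical, and the lower limit of each interval is taken as the printed figure (110, not 109.5), following the convention used in the text.

```python
# Table 33 as (lower limit of interval, frequency), lowest interval first.
TABLE_33 = [(80, 2), (85, 3), (90, 2), (95, 7), (100, 4), (105, 9),
            (110, 15), (115, 6), (120, 2), (125, 3), (130, 1), (135, 1)]
WIDTH = 5  # size of the class interval


def score_at(fraction, table=TABLE_33, width=WIDTH):
    """Score value below which the given fraction of the cases falls."""
    n = sum(f for _, f in table)
    target = fraction * n          # e.g. 55 / 2 = 27.5 for the median
    below = 0                      # frequencies counted so far
    for low, f in table:
        if f and below + f >= target:
            needed = target - below            # cases needed from this interval
            return low + needed / f * width    # interpolate within the interval
        below += f
    raise ValueError("fraction must lie between 0 and 1")


median = score_at(1 / 2)
q1 = score_at(1 / 4)
q3 = score_at(3 / 4)
print(round(median, 2))         # 110.17
print(round(q3, 2))             # 114.75
print(round(q1, 2))             # 99.82
print(round((q3 - q1) / 2, 2))  # 7.46, the quartile deviation Q
```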

Computing the mean. The method of computing the mean 
illustrated in Table 33 is known as the short method, but it 
requires a longer time during the learning stage, or when the 
number of scores is small, than the old arithmetic-average 
method. The rules for computing the mean are as follows: 

1. Guess or assume a mean at the midpoint of some class interval. 
(The interval selected was 110 to 114, and its midpoint is 112.5.)

2. Beginning with a zero opposite the assumed mean, add a d column 
(no. 3) showing how far the midpoint of each class interval deviates 
in a positive or negative direction from the assumed mean. (Note 
that these are class interval deviations and that since the size of 
the class interval is five, each deviation represents a distance of 
five score points.) 

3. Multiply each class interval deviation indicated by d (No. 3) by
the number of frequencies in the interval as shown in the f column
(No. 2) and enter the results in the fd column (No. 4).

4. Find the algebraic sum of the fd column. (In the present computation,
this sum is −45. Had this sum been zero, it would
have indicated that the mean guessed or assumed was correct; a
−45 indicates that the assumed mean is too high and that a correction
must be made.)

5. A correction is made by dividing this sum by the total number 
of measurements, reducing it to score form by multiplying by 
the size of the class interval, and adding it algebraically to the 
assumed mean. (Thus, −45 divided by 55, multiplied by 5,
and added algebraically to (actually subtracted from) 112.5 yields 
a mean of 108.4.) 
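The five rules can be checked with a short Python sketch built on the frequencies of Table 33. The function name is hypothetical, and the midpoint of an interval is taken as its lower limit plus half the interval size (112.5 for 110 to 114), as in the text. The second call suggests that the result does not depend on which interval is guessed for the assumed mean.

```python
# Table 33 as (lower limit of interval, frequency), highest interval first.
TABLE_33 = [(135, 1), (130, 1), (125, 3), (120, 2), (115, 6), (110, 15),
            (105, 9), (100, 4), (95, 7), (90, 2), (85, 3), (80, 2)]


def short_method_mean(table, width, assumed_low):
    """Mean by the short method, given the lower limit of the guessed interval."""
    assumed_mean = assumed_low + width / 2          # rule 1
    n = sum(f for _, f in table)
    sum_fd = 0
    for low, f in table:
        d = (low - assumed_low) // width            # rule 2: deviation in intervals
        sum_fd += f * d                             # rules 3 and 4
    return assumed_mean + sum_fd / n * width        # rule 5: apply the correction


print(round(short_method_mean(TABLE_33, 5, 110), 1))  # 108.4
print(round(short_method_mean(TABLE_33, 5, 105), 1))  # 108.4 again
```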

Norms and standards. When a representative group of 
students of a particular grade or age has taken a test and the 
mean of their scores is computed as just illustrated, it is known 
as a norm. Grade norms are frequently computed by months; 
thus, if the mean score of those who have been 3 months in the 
5th grade (grade placement 5.3) is 87, students who henceforth 
make 87 on this test are said to have a grade-placement of 5.3.
Age norms are computed in a similar manner; thus, if the mean 
score for students who are 8 years 7 months old chronologically 
is 122 on a certain reading test, those who later take the same 
test and make a score of 122 are said to have a reading age of 
8-7, or reading ability equal to the average of those who are
8 years 7 months old chronologically. Had the test been one
of mental maturity or intelligence instead of reading, the result
would have been interpreted as a mental age of 8-7; that is, the
student would possess mental ability or maturity, as revealed 
by the test, equal to the average of those who are 8 years 
7 months old chronologically. 

The norm of an achievement test is not a standard to be 
reached by all classes and schools. Desirable standards may 
be set higher or lower for particular communities. A norm is a 
point of departure in interpreting test results. Individual
scores for members of a class on a particular test have no meaning
until referred to the average. Modifications of objectives,
materials, procedures, time allotments, and the like may take 
their point of departure from a study of the relation between 
average scores actually attained by classes and schools, and the 
published norms revealing what unselected or representative 
groups have accomplished. 

Norms are sometimes developed in percentile form; that is,
instead of dealing with age or grade intervals, whole grades,
or all pupils of a particular age, are utilized. Such norms
reveal the scores made or exceeded by the best 10, 20, 50, or
some other per cent of pupils who are in the particular grade, or
who have attained a particular age. Table 34 presents a section
of the percentile norms for grade 13 (college freshmen),
advanced battery of the Progressive Achievement Tests.

The standard deviation. To know the range of scores and 
the mean score obtained by a group on a given test aids in 
understanding the nature of student ability or achievement 
represented by that group. Two groups obtaining the same 
mean and same range of scores on a given test might, however, 
still be very different. For example, in Table 33, only five 
instead of fifteen students might have obtained scores of 110- 
114, the other ten being located nearer the extremes. For 
this, and other reasons, it is often desirable to utilize a measure 
which reveals the extent to which scores are grouped around 
the mean or scattered away from it. One such informal
measure has already been presented, the Q. Another more
useful measure is the standard deviation, usually indicated as
S.D., or σ, the Greek letter sigma, and so pronounced.

The standard deviation may be defined, informally, as the
distance (usually in terms of scores) in both directions from the
mean necessary to include roughly two-thirds of the measurements,
cases, or students. (Technically, this represents
68.26% of the distribution.)



Table 34. Percentile Norms for Grade 13 (College Freshmen)

Thus, below Table 33, the σ of the distribution is shown to be
11.9, or about 12 score (I.Q.) points; it indicates that the
range of scores between 108.4 − 12, or 96, and 108.4 + 12, or
120, will include roughly two-thirds of the frequencies. This
relation is exact only when the distribution is normal. The σ
of a group is usually interpreted as a measure of its homogeneity
with respect to the trait or ability measured. Thus, a
group having a σ of 10 I.Q. points would be more homogeneous
with respect to the I.Q.’s represented than a group with a σ of
20 I.Q. points.

The σ is usually computed from the assumed or guessed
mean, and the work involved in computing the mean is used in
obtaining the σ. Thus, referring again to Table 33, it is necessary
to add only the fd² column (no. 5) before substituting in
the σ formula. This column is obtained by multiplying the
quantities in columns 3 and 4. Note that there are no negative
quantities in column 5.

The formula for obtaining the σ is also given below Table 33.
The steps in the computation are as follows:

1. Add the fd² column. (349, the Σfd²)

2. Divide this sum by N. (349 ÷ 55 = 6.35)

3. Divide the algebraic sum of column 4 (−45, or Σfd) by N and
square it. (.67)

4. Subtract the result of operation 3 from that of operation 2. (5.68)

5. Extract the square root of this difference. (2.38)

6. Change this quantity to score form by multiplying by the size of
the class interval, 5. (11.9)
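
The six steps above, together with the short-method mean computed earlier, may be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name and its parameter names are hypothetical, and the only figures used are the column sums quoted in the text (Σfd = −45, Σfd² = 349, N = 55, class interval 5, assumed mean 112.5).

```python
import math

def short_method_mean_sd(sum_fd, sum_fd2, n, interval, assumed_mean):
    """Mean and S.D. of a grouped distribution, computed from an assumed mean.

    sum_fd  -- algebraic sum of the fd column (column 4)
    sum_fd2 -- sum of the fd-squared column (column 5)
    """
    correction = sum_fd / n                        # step 3, before squaring
    mean = assumed_mean + correction * interval    # short-method mean
    variance_units = sum_fd2 / n - correction ** 2 # steps 2 to 4
    sd = math.sqrt(variance_units) * interval      # steps 5 and 6
    return mean, sd

# Column sums quoted in the text for Table 33.
mean, sd = short_method_mean_sd(
    sum_fd=-45, sum_fd2=349, n=55, interval=5, assumed_mean=112.5
)
print(round(mean, 1), round(sd, 1))  # 108.4 11.9
```

Carrying the arithmetic without intermediate rounding gives the same mean and S.D. as the hand computation, to one decimal place.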

The normal distribution. A knowledge of the standard deviation
makes it possible to take advantage of various applications
of the normal-distribution hypothesis, known variously
as the normal curve, the curve of error, the biological curve,
and so on. The theoretically correct curve is mathematical,
and may be obtained by expanding a binomial, such as (a + b),
to some power. The normal-distribution hypothesis
as applied to education is simply a statement of how traits,
qualities, ability, and achievement are expected, theoretically,
to be distributed. Such normal distributions are never obtained
except by chance, but the distributions actually obtained
usually resemble the theoretically normal distribution
more than any other. By utilizing the mathematical properties
of the normal distribution it is possible, among other
things, to determine the difficulty of test items and to scale tests
and develop product scales; to determine the extent to which a
particular group of scores represents a normal distribution and
may therefore serve as the basis for drawing valid conclusions;
to make deductions concerning the discrimination of a test
when it has been applied to a presumably unselected group; to
determine the reliability of means and differences between
means and other types of statistical computations; and to
transmute the scores of tests with differing numbers of test
items into more comparable units.

Figure 63 represents a normal distribution of I.Q.’s when the 
mean is 100. 

FIG. 63. NORMAL DISTRIBUTION OF SCORES (I.Q.’s)

Figure 63 illustrates the mathematical properties of the
normal distribution in terms of whole standard deviations.
The σ of a normal distribution of I.Q.’s with a mean of 100 is
about 16 I.Q. points. Figure 63 reveals that after 3 σ’s have
been measured in either direction from the mean of a normal
distribution, over 99 per cent of all cases or measures have
been included. Only the remaining .26 of one per cent lies
outside these limits, .13 of one per cent, theoretically, at each
end. Important characteristics of the normal distribution are
as follows:

1. The first S.D.’s in either direction from the mean of a normal
distribution each include 34.13 per cent of the distribution of cases
(as might be estimated from the areas) included under the curve
when perpendiculars are erected at the 1 S.D. points above and
below the mean, or at the +1 S.D. and −1 S.D. points.

2. Succeeding S.D.’s, however, include much smaller per cents of the
distribution of cases, the second S.D.’s including 13.59 per cent
each and the third S.D.’s only 2.15 per cent. S.D.’s are equal,
therefore, in score units or points, but these include different per
cents of the total number of cases or measurements of a distribution
when perpendiculars are erected at the points which separate
them.

3. The per cent of the scores included between the mean and a
perpendicular erected at any S.D. or fractional S.D. point from
the mean is always constant; that is, perpendiculars erected at
.6 of an S.D. or 1.7 S.D. from the mean will always include the
same per cent of the normal distribution between these points
and the mean.

4. From point 3 just presented, it follows that, given the per cent
of cases between the mean and a perpendicular erected at a point
some distance from the mean, we should be able to find this distance
in terms of S.D. units; and given the distance between the
mean and a point located some distance from it in terms of S.D.
units, we should be able to tell the per cent of the distribution
which lies between the mean and a perpendicular erected at the
given point. In order to translate per cent of this distribution
into S.D. units and vice versa, it is necessary to work with an S.D.
table, which gives the equivalent of each in terms of the other.
(See Table 36.)
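
The two translations described in point 4 may also be carried out by machine rather than by table. The following Python sketch is offered only as an illustration; the function names are hypothetical, and the computation relies on the error function and the inverse normal function of the Python standard library rather than on any table given in this book.

```python
import math
from statistics import NormalDist

def percent_between_mean_and(sd_units):
    """Per cent of a normal distribution lying between the mean and a
    perpendicular erected sd_units S.D. from the mean (either side)."""
    return 100 * 0.5 * math.erf(abs(sd_units) / math.sqrt(2))

def sd_units_for_percent(percent):
    """Reverse lookup: S.D. distance from the mean that marks off the
    given per cent (between 0 and 50) of the distribution."""
    return NormalDist().inv_cdf(0.5 + percent / 100)

for z in (0.6, 1.0, 1.7, 2.0):
    print(z, round(percent_between_mean_and(z), 2))

print(round(sd_units_for_percent(34.13), 2))
```

The per cents printed for .6, 1.0, 1.7, and 2.0 S.D. agree with a standard S.D. table to two decimals (22.57, 34.13, 45.54, and 47.72), and the reverse lookup of 34.13 per cent returns 1.0 S.D.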

The reliability of a mean. Time does not permit a teacher
to test all that has been taught or all that pupils know about
any one unit of work. Instead, the teacher must be content
to evaluate through a sampling of what has been learned. As
a result, if additional tests are given, the means of these tests
will tend to vary, some of them being higher and some of them
lower. If she wishes, therefore, to obtain a more accurate
measurement, she may give twenty-five or thirty (or an infinite
number) similar tests, computing a mean for each.
Furthermore, she may tabulate these twenty-five or thirty
means into a frequency distribution and compute its standard
deviation. This would indicate in a rough way within what
limits additional means would probably vary. However, this
would be a long, arduous task, and a mathematical short cut
to approximately the same result has been developed. If the
standard deviation and mean of a distribution are known, the
reliability of the mean (usually indicated by its standard error,
and identified by the Greek letter epsilon, ε) may be found by
the following formula:¹

$$\epsilon_m = \frac{\sigma}{\sqrt{N}}$$

The data of Table 33 reveal that the S.D. of the distribution
is 11.9. Substituting in the above formula and dividing 11.9
by the square root of 55 (the number of measurements) yields
a standard error of 1.6. Standard-error units are interpreted
as S.D. units, as illustrated in Figure 64.

FIG. 64. NORMAL DISTRIBUTION OF MEANS
Means: 103.6, 105.2, 106.8, 108.4, 110, 111.6, 113.2
$\epsilon_m$’s: −3, −2, −1, mean of means, +1, +2, +3

Since 68.26 per cent (roughly two-thirds) of an infinite number
of additional means would theoretically fall between 106.8
and 110 (108.4 ± 1.6), we may infer that the chances are 2 to
1, or 2 out of 3, that the true mean lies within these limits.
Since 95.44 per cent of an infinite number of additional means
would theoretically fall between the limits 105.2 and 111.6
(108.4 ± 2 × 1.6), we may infer that the chances are about
95 out of 100, 19 out of 20, or 19 to 1 that the true mean lies
within these limits. And it is practically certain that additional
means would fall within 3 $\epsilon_m$ units from the obtained
mean of 108.4, or between 103.6 and 113.2, because this range
includes 99.74 per cent of a normal distribution. This knowledge
about an obtained mean makes possible a more rational
interpretation of its significance and its relation to a test norm,
or to other similar means. The determination of the reliability
of a mean (or the reliability of the difference between two
means) rests upon the assumption that if an infinite number of
means were computed, they would themselves constitute a
normal distribution of means.

¹ See Yule, G. U. An Introduction to the Theory of Statistics, ninth edition. London,
Charles Griffin and Company, 1929.

Note. When the number of cases is small use N − 1 rather than N in the formula.
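
The standard error of the mean and the three sets of limits just described may be checked with a short Python sketch. It assumes only the figures quoted in the text (mean 108.4, S.D. 11.9, N = 55); the function name is hypothetical.

```python
import math

def standard_error_of_mean(sd, n):
    """Standard error of a mean: the S.D. divided by the square root of N."""
    return sd / math.sqrt(n)

mean, sd, n = 108.4, 11.9, 55
se = standard_error_of_mean(sd, n)

# Limits at 1, 2, and 3 standard-error units on either side of the mean.
limits = {k: (round(mean - k * se, 1), round(mean + k * se, 1)) for k in (1, 2, 3)}
print(round(se, 1))
for k, (low, high) in limits.items():
    print(k, low, high)
```

Under these assumptions the standard error rounds to 1.6, and the limits agree with those marked on the baseline of Figure 64.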

The reliability of a difference between two means. Two or 
more classes doing the same work rarely obtain identical mean 
scores on the same test, no matter how valid the test or similar 
the groups. When one seventh grade obtains a score of 117 
on an arithmetic test while a similar grade obtains 125, administrators
and teachers often infer from these scores that one
class is definitely superior to the other. The significance of
this difference cannot be determined from an inspection of the
scores themselves; it depends upon the reliability of this obtained
difference of 8, which in turn depends upon the reliabilities
of the two means, 117 and 125.

If each class were tested an infinite number of times, means
computed for each, and differences between the means noted,
they would vary above and below the obtained difference of 8.
The results might be recorded as in Table 35.


Table 35. Differences Between Means

Tests       Class A     Class B     Difference
            Means       Means       Between Means
 (1)         (2)         (3)         (4)

 1           117         125         + 8  (Obtained mean)
 2           124         122         − 2
 3           120         120           0
 4           116         118         + 2
 5           120         126         + 6
 6           120         130         +10
 7           117         129         +12
 8           130         124         − 6
 9           125         121         − 4
 10
 N tests     120         124         + 4


If an infinite number of such differences as are recorded in 
column 4 (Table 35) were obtained and tabulated they would 
themselves theoretically constitute a normal distribution of 
differences, and their standard deviation might be computed. 

However, no such laborious procedure is necessary; a short
cut has also been devised for use here and may be expressed
in the following formula:¹

$$\epsilon_D = \sqrt{\epsilon_{m_1}^2 + \epsilon_{m_2}^2}$$

This formula indicates that the standard error (ε) of a difference
between two means is equal to the square root of the
sum of the squares of the standard errors of their respective
means.

To illustrate the use of this formula, assume the following
conditions: Two seventh grades each containing 49 students
have taken the above-mentioned arithmetic test, Class A obtaining
a mean of 117, and Class B, a mean of 125. Assume
that the standard deviations of the original distributions of
student scores were computed and found to be 20 for Class A
and 17.7 for Class B. The question to be answered is, “Is
there a significant difference between the classes?”

¹ This formula is used when no correlation exists between the series; when correlation
exists, the term $-2r\epsilon_{m_1}\epsilon_{m_2}$ must be added under the radical.

Using the formula $\epsilon_m = \sigma / \sqrt{N}$, the standard error of mean
117 is found to be 2.86; and the standard error of mean 125,
2.53. Squaring each, adding these squares, and extracting the
square root yields an $\epsilon_D$ of 3.83 for the obtained difference of 8.

The standard error of a difference is also interpreted like a 
standard deviation. In fact, knowing the difference between 
two means, and its standard error, we can construct the range 
and frequency of differences which an infinite number of tests 
would theoretically produce, as illustrated in Figure 65. 

FIG. 65. NORMAL DISTRIBUTION OF DIFFERENCES BETWEEN MEANS
Differences: −3.49, .34, 4.17, 8, 11.83, 15.66, 19.49
$\epsilon_D$’s: −3, −2, −1, mean of differences, +1, +2, +3

Figure 65 reveals that the chances are 2 out of 3, or 2 to 1,
that the true difference between the class means lies between
4.17 and 11.83; that the chances are 19 out of 20, or 19 to 1, that
the true difference lies between .34 and 15.66. But the area
below zero also reveals that the difference may shrink to less
than zero; that is, it may be reversed in favor of Class A,
hence we are not justified in contending that Class B is definitely
superior to Class A. Accurate tables have been developed
for reading the probabilities of a true difference above
zero for fractional standard-error differences, but no such accuracy
is needed for ordinary classroom testing. A simple
method for testing the significance of the difference between
two means is to divide the obtained difference by the $\epsilon_D$. If
the quotient is 3 or more we may infer that a significant difference
exists; the larger the quotient above 3, the more significant
this difference. In the present instance 8 ÷ 3.83 yields
2.09, hence we assume that no significant difference exists. To
be significant the obtained difference between the means would
have had to be at least 3 times 3.83, or 11.49. The use of the
number 3, of course, is due to the fact that a normal distribution
is practically exhausted at three standard deviations or
three standard-error units from the mean.
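
The test of a difference between two means may be sketched in Python as follows. The function names are hypothetical, and the rule of requiring a quotient of 3 or more is simply the one stated above. One caution is in order: when the arithmetic is carried without intermediate rounding, the standard error of the difference comes out at about 3.82 rather than the 3.83 used in the text; the conclusion is unaffected.

```python
import math

def standard_error_of_mean(sd, n):
    """Standard error of a mean: the S.D. divided by the square root of N."""
    return sd / math.sqrt(n)

def standard_error_of_difference(se_a, se_b):
    """Standard error of a difference between two uncorrelated means."""
    return math.sqrt(se_a ** 2 + se_b ** 2)

# Figures quoted in the text: two classes of 49 students each.
se_a = standard_error_of_mean(20.0, 49)   # Class A, mean 117
se_b = standard_error_of_mean(17.7, 49)   # Class B, mean 125
se_diff = standard_error_of_difference(se_a, se_b)

quotient = (125 - 117) / se_diff
significant = quotient >= 3               # the rule of thumb given in the text
print(round(se_a, 2), round(se_b, 2), round(se_diff, 2), round(quotient, 1), significant)
```

The quotient is close to 2.1, well short of 3, so the sketch reaches the same verdict as the hand computation: no significant difference is shown.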

Standard units and comparable scores. If two tests, A and 
B, contain 75 and 140 items, respectively, the scores obtained 
from them cannot be compared directly. A score of 75 on 
test A would be perfect; a similar score on test B would be 
interpreted in terms of the mean score, or one obtained from 
an unselected group. Some achievement tests use grade-placements,
or subject-ages in direct or percentile form, which
are familiar concepts and easily understood by teachers.
Acquaintance with the mathematical properties of the normal
distribution, and a knowledge of how scores of valid tests are
usually distributed, also makes it possible to transmute these
scores into standard-deviation or distance units. These have
been utilized also by other test-makers. To use this technique
it is necessary to utilize not only the total S.D.’s illustrated
in Figure 63, but also fractional parts of these S.D.’s as
well, as presented in Table 36.

Table 36. Per Cent of the Cases of a Normal Distribution
Which Lie Between the Mean and Various S.D. Unit
Distances from the Mean

S.D. Units    Per Cent      S.D. Units    Per Cent
              of Cases                    of Cases
   (1)          (2)            (3)          (4)

   .00          0.00           1.50         43.32
   .05          1.99           1.55         43.94
   .10          3.98           1.60         44.52
   .15          5.96           1.65         45.05
   .20          7.93           1.70         45.54
   .25          9.87           1.75         45.99
   .30         11.79           1.80         46.41
   .35         13.68           1.85         46.78
   .40         15.54           1.90         47.13
   .45         17.36           1.95         47.44
   .50         19.15           2.00         47.72
   .55         20.88           2.05         47.98
   .60         22.57           2.10         48.21
   .65         24.22           2.15         48.42
   .70         25.80           2.20         48.61
   .75         27.34           2.25         48.78
   .80         28.81           2.30         48.93
   .85         30.23           2.35         49.06
   .90         31.59           2.40         49.18
   .95         32.90           2.45         49.29
  1.00         34.13           2.50         49.38
  1.05         35.31           2.55         49.46
  1.10         36.43           2.60         49.53
  1.15         37.49           2.65         49.60
  1.20         38.49           2.70         49.65
  1.25         39.44           2.75         49.70
  1.30         40.32           2.80         49.74
  1.35         41.15           2.85         49.78
  1.40         41.92           2.90         49.81
  1.45         42.65           2.95         49.84
                               3.00         49.87

Table 36 may be interpreted in connection with Figure 63.
Opposite 1.00 in column 1 (Table 36) and opposite 2.00 and
3.00 in column 3 will be found the per cents corresponding to
whole S.D.’s in either direction from the mean in Figure 63;
2.00, of course, includes both 34.13 and 13.59; and 3.00 adds
2.15. Table 36 values include the total per cents from the mean
in each case.

Figure 63 reveals that 34.13 per cent of a normal distribution 
of I.Q.’s (mean 100) will be located between 100 and 116. 
However, if the per cent of I.Q.’s located between 100 and 108,
for example, is desired, it is necessary to transform the score
distance (8 points) into an S.D. unit. Since a total S.D. in
this distribution is 16, 8 points would constitute .5 S.D. Reference
to Table 36, columns 1 and 2, reveals that 19.15 per cent of a
normal distribution lies between the mean and a point .5 S.D.
from the mean (in either direction). If it is desired to know
what per cent of the I.Q.’s lies between any two points on opposite
sides of the mean, the per cents corresponding to the S.D.
distances are found and added.

However, if the two points lie on the same side of the mean,
the computation is different. For example, to find what
per cent of the I.Q.’s of this distribution lies between 120 and
132, the computation is as follows:

1. Score 120 is 20 score points or 1.25 S.D. units from the mean, and
marks off (referring to Table 36) 39.44 per cent of the distribution.

2. Score 132 is 32 score points or 2 S.D. units from the mean, and
marks off (referring to Table 36) 47.72 per cent of the distribution.

3. 47.72% − 39.44% = 8.28 per cent of the distribution, the probable
per cent of I.Q.’s between 120 and 132.
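
The same-side and opposite-side computations may be checked with the following Python sketch. It assumes, as the text does, a normal distribution of I.Q.’s with a mean of 100 and an S.D. of 16; the function name is hypothetical, and the areas come from the standard library’s normal distribution rather than from Table 36, so the last decimal may differ slightly from a figure built up from rounded table entries.

```python
from statistics import NormalDist

# Assumed distribution of I.Q.'s, as in the text: mean 100, S.D. 16.
iq = NormalDist(mu=100, sigma=16)

def percent_between(low, high):
    """Per cent of the distribution lying between two scores."""
    return 100 * (iq.cdf(high) - iq.cdf(low))

mean_to_108 = percent_between(100, 108)    # one point at the mean, one .5 S.D. above
same_side = percent_between(120, 132)      # both points above the mean
opposite_sides = percent_between(92, 108)  # .5 S.D. on each side of the mean

print(round(mean_to_108, 2), round(same_side, 2), round(opposite_sides, 2))
```

The area between 100 and 108 rounds to 19.15 per cent, and the area between 92 and 108 is twice that. The area between 120 and 132 comes out within a hundredth of a per cent of the 8.28 obtained by subtracting the two table entries.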

If it is desired, on the other hand, to know the score or S.D.
limits of any particular per cent of a normal distribution, these
may be found by reversing the use of the table. For example,
it is desired to know the highest and lowest probable scores
made by the 45 per cent of pupils located on either side of the
mean. Column 4, Table 36, reveals that the 45 per cent limit
from the mean is approximately 1.65 S.D. units from the mean.
An S.D. unit is 16 score points, therefore the distance in score
points is 1.65 × 16, or approximately 26 score points from the
mean. The mean is 100, therefore the most probable highest
and lowest scores are approximately 126 and 74, respectively,
in terms of I.Q.

QUESTIONS AND EXERCISES

1. Outline some of the most serious objections to measurement. 
Write your answers to these objections. 

2. Indicate your reasons, if any, for desiring to utilize standard tests. 

3. Describe and evaluate the outline of major types of tests. What 
modifications do you recommend? 

4. How are test scores tabulated? 

5. Describe the computation of the Md and Q.

6. Make a list of the difficulties you experience in understanding the 
computation of the mean, and take them to class. See that you 
clear them up. 

7. Indicate the relationship between the computation of means and 
a discussion of norms and standards. 

8. Make a list of the exact points of difficulty in understanding the
S.D. Hand this list to your instructor to explain in class. Do
not allow yourself to go on without understanding the S.D.

9. What is the normal distribution? How is the theoretically correct
distribution derived? Memorize the per cent of the distribution
included in each of the first three S.D.’s.

10. Explain why means are not entirely reliable and how the degree of 
reliability is determined. 

11. Two classes, X and Y, take the same test. Class X makes a mean
score of 105; the Class Y mean is 111. The S.D. of the scores of
Class X is 21; the Class Y S.D. is 20. Forty-nine students belong
in Class X and 64 in Class Y. Compute and explain the $\epsilon_m$ for
each class.

12. Compute and explain the $\epsilon_D$ for the two classes in question 11
and draw the curve of differences which might be expected from an
infinite number of similar tests.

13. Explain the nature of the S.D. (also ε) table. Make a list of your
difficulties and see that they are cleared up before the class goes on
to other work.

CHAPTER FIFTEEN


NATURE OF VALID MEASUREMENT: 
STANDARDIZING AND 
INTERPRETING TESTS 

Some of the fundamental characteristics of valid tests as well 
as the related elementary statistical computations are now 
familiar to the student. The purpose of this chapter is to 
describe certain more advanced statistical processes which are 
useful in studying and selecting tests already available and in 
validating new tests. 

Scaling individual test items. The standard-deviation 
table described in the last chapter may be used in scaling or 
obtaining S.D. values for individual items of tests. Such S.D. 
values, once obtained, are more universal in nature, hence 
more comparable, than the original arbitrary or raw scores. 
The computation is illustrated in Figure 66. 

The test is first given to as normal (unselected) a group as
is available.¹ The per cent of the group which succeeded with
each item is next determined. In Figure 66, test item C was
so difficult that only 6.68 per cent of the group succeeded with
it; 22.66 per cent succeeded on item B; and 96 per cent succeeded
on item A.

FIG. 66. ILLUSTRATION OF PROCESS OF SCALING TEST ITEMS
Per cent succeeding: Item A, 96%; Item B, 22.66%; Item C, 6.68%
S.D. locations of items: Item A, −1.75σ; Item B, +.75σ; Item C, +1.50σ

These per cent limits are next transmuted into S.D. units
(distances from the mean) in the following manner:

1. Since 6.68 per cent succeeded on item C, the remaining 43.32 per cent
of the cases above the mean failed. This point, reference to Table 36
reveals, lies 1.50 S.D. units above the mean.

2. Between the mean and the 22.66 per cent who succeeded on item
B, lie 27.34 per cent who failed. The point of separation, or location
of item B, is therefore .75 S.D. above the mean.

3. Ninety-six per cent succeeded on item A, 46 per cent of whom
are located below the mean. This 46 per cent marks a point
1.75 S.D. below the mean and locates item A.

In this same manner, one, two, or five hundred test items
may be scaled in terms of their S.D. unit distances from the
mean. Since these S.D. locations of test items are computed
in terms of the success of students, they may be said to indicate
the relative difficulties of such test items, and constitute
a scale of S.D. units. However, since all computations are
made with reference to the mean, the zero point of the scale
is in the middle instead of at the end. The difficulty may be
largely eliminated by arbitrarily moving the zero point down
3 S.D. units² and redistributing the scores, as indicated in
Figure 66. Test items located .1 S.D., .2 S.D., .5 S.D., or
some other S.D. distance apart may then be selected, and a
scale of items differing by approximately equal increments of
difficulty constructed.
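
The scaling of items A, B, and C may be reproduced with a few lines of Python. The sketch is illustrative only: the function name is hypothetical, the inverse normal function of the standard library stands in for Table 36, and the shift of 3 S.D. units is the arbitrary zero point described above.

```python
from statistics import NormalDist

def item_sd_location(percent_succeeding):
    """S.D. distance from the mean at which an item falls, given the
    per cent of an unselected group succeeding with it.  Harder items
    (smaller per cents) fall farther above the mean."""
    return NormalDist().inv_cdf(1 - percent_succeeding / 100)

# Per cents succeeding, as quoted for Figure 66.
items = {"A": 96.0, "B": 22.66, "C": 6.68}

locations = {name: round(item_sd_location(pct), 2) for name, pct in items.items()}
# Move the zero point 3 S.D. units below the mean so that all values are positive.
rescaled = {name: round(item_sd_location(pct) + 3, 2) for name, pct in items.items()}

print(locations)
print(rescaled)
```

The locations agree with those read from the table (−1.75, +.75, and +1.50 S.D.), and with the zero point moved down 3 S.D. units item B falls at 3.75, as in Figure 66.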

¹ Normality of data may be determined statistically.

² A normal distribution is practically exhausted at 3 S.D.’s from the mean. However,
the zero may be moved further below the mean, depending upon the point selected to represent
just no ability, achievement, or other evidence of the trait or quality in question.

Scaling total test scores: the T-score. There are certain
problems in utilizing ordinary scales in which the difficulty
of each individual item has been determined. First of all,
pupils do not succeed up to a given scale item, and then fail
on the remainder. Instead, when approaching the limits of
their abilities, they may fail on some items which are easier
for pupils in general and succeed on items later in the scale
which are more difficult for pupils in general, thus creating a
scoring problem. Again, scales have been produced with the
zero points anywhere from 2.50 S.D.’s to 5.00 S.D.’s below the
mean of the test, and scores obtained with different zero
points are not comparable. Finally, the difficulty values of
certain individual test items have been found to vary considerably
when scales have been applied to different groups.

Scaling the total test scores obtained by pupils (instead of
scaling each test item, as illustrated in Figure 66) makes it
possible to avoid certain of the difficulties presented. This
approach was first utilized by McCall¹ in his development of
the T-score. Scaled scores based on total scores are more
stable than the difficulty values of individual test items. The
same zero point (five S.D.’s below the mean) may be utilized,
thus making the scores of different tests more directly comparable.
The process of scaling is very similar to that illustrated
in Figure 66, except that instead of using the per
cent of pupils who passed each test item as the point of departure,
each T-score² is computed from the per cent of pupils exceeding
a given total score plus one-half of those attaining it.

Figure 67 illustrates the relation of the T-score to the normal
distribution. A designates a normal distribution with a 10
S.D. baseline. B represents the same distribution with the
zero point moved to −5 S.D. C indicates the total possible
distribution of T-scores (0-100); in practice, T-scores vary
between 15 and 85. Since there are ten T-scores for each S.D.,
each succeeding T-score represents an increment of .1 S.D. in
difficulty value, or success.

FIG. 67. RELATION BETWEEN NORMAL DISTRIBUTION AND T-SCORES

¹ McCall, W. A. How to Measure in Education. New York, The Macmillan Company,

² T-scores were originally based by McCall on distributions of unselected 12-year-olds.
However, the same procedure is now used with other age groups and mixed age groups,
hence the term T-score will be retained to indicate the procedure. Such scores are also
known simply as scaled scores.

In computing T-scores for tests, scaling total test scores, the
following steps may be utilized:¹

1. Give the entire test to as normal a sampling as possible of the
population for whom the test is intended.

2. Make a frequency table showing the number of subjects (column
2) obtaining each total score (column 1).

3. Add a third column to the frequency table and compute and
enter opposite each total score in column 1 the number of pupils
exceeding that score, plus one-half of those attaining it.

4. Express each entry in column 3 (no. 3 above) in per cent form,
indicating what per cent of the entire group of subjects each entry
represents.

5. Express each per cent in column 4 (no. 4 above) in terms of S.D.,
utilizing an S.D. table, as illustrated in Table 36.

6. Multiply each S.D. value by 10, and subtract the result from or
add it to 50 (mean T-score), depending upon whether the
S.D. value is negative or positive.
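
The six steps may be sketched in Python as follows. This is an illustration under stated assumptions, not a reproduction of McCall’s own tables: the function name and the small set of total scores are hypothetical, the inverse normal function of the standard library takes the place of the S.D. table, and each T-score is rounded to the nearest whole number.

```python
from collections import Counter
from statistics import NormalDist

def t_scores(total_scores):
    """Return a table {total score: T-score} for a group of pupils."""
    n = len(total_scores)
    frequency = Counter(total_scores)          # step 2: frequency table
    table = {}
    exceeding = 0                              # pupils above the current score
    for score in sorted(frequency, reverse=True):
        # Step 3: number exceeding the score plus one-half of those attaining it.
        count = exceeding + frequency[score] / 2
        proportion = count / n                 # step 4, as a proportion
        # Step 5: S.D. distance; few pupils exceeding means a point above the mean.
        sd_value = NormalDist().inv_cdf(1 - proportion)
        table[score] = round(50 + 10 * sd_value)   # step 6
        exceeding += frequency[score]
    return table

# Hypothetical total scores for nine pupils (illustrative data only).
scores = [3, 5, 5, 6, 6, 6, 7, 7, 9]
table = t_scores(scores)
print(table)
```

In this small example the middle score, 6, is exceeded by exactly half the group when half of those attaining it are counted, and so receives a T-score of 50; the highest and lowest scores fall equal distances above and below it.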

¹ See Garrett, H. E. Statistics in Psychology and Education, second edition. New York,
Longmans, Green and Company; also Bulletin 20, Educational Records Bureau, June,
1937.

Product scales. Product scales may also be developed
through a special application of the scaling principle. English
composition, writing, sewing, and the like are not usually
entirely good or bad, or right or wrong; instead, these products
tend to differ in merit from very poor to superior or nearly
perfect. Individual estimates of merit vary widely, and product
scales are developed to aid in evaluation.

The following is one procedure which may be used in developing
a product scale:

1. A wide range of samples of the product to be scaled is first
obtained.

2. These samples may be judged and thrown into as many groups 
as there are to be steps in the scale. 

3. Keep track of the judgments of succeeding judges, and eliminate 
samples of the product which change from one step to another. 

4. Select typical samples from those left for each step of the scale. 

5. Have judges rank these typical samples in order of merit. 

6. Determine what per cent of the judges rank each sample above 
every other sample. 

7. Transform these per cent differences into S.D. differences. 

8. Average the various S.D. differences between samples to obtain 
final S.D. differences between samples. 

9. Note the total range of S.D. differences between samples; select
samples .5 S.D., .75 S.D., or in terms of some other S.D. difference
in merit for the final scale.

10. These S.D. units may be used directly as scores, or other numbers 
preserving the relationship between samples may be assigned. 

Many scales of merit in handwriting now in use were derived
in a different manner. The Ayres scale was used as a
point of departure and samples equal to the steps on the
Ayres scale derived through rating.

The P.E. of a mean or of a difference. Thus far we have
utilized only standard errors in dealing with means and differences
between means in applications of the scaling principle.
The P.E. or probable error may also be used. It may be
defined as .6745 of the standard error. The formula is as
follows:

$$\text{P.E.}_m = .6745\,\epsilon_m$$

Being smaller than the standard error, it takes 4 P.E.’s in
either direction from the mean of a distribution of means or
differences to complete, practically, a normal distribution.
The limits of means and differences must also be interpreted
differently. For example, the first P.E. in either direction
from the mean includes the middle 50 per cent of the distribution
instead of 68.26 per cent as in the case of the ε; and
instead of saying that the chances are 2 out of 3, or 2 to 1, we
can only say that the chances are 50-50 or 1 to 1 that a given
possibility is true.

The use of P.E.’s requires a P.E. table similar to the S.D.
(ε) table presented above. But P.E.’s must be derived from
ε’s, hence additional computations are necessary and additional
opportunity for the introduction of errors is provided.
The writer recommends the use of the ε rather than the P.E.
as a measure of reliability.
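
The relation between the P.E. and the standard error may be verified with a short Python sketch. The names used are hypothetical; the factor .6745 is not taken from a table here but is recovered as the S.D. distance that cuts off the middle 50 per cent of a normal distribution, and the standard error of 1.6 is the one found earlier for the mean of 108.4.

```python
from statistics import NormalDist

unit_normal = NormalDist()

# S.D. distance on either side of the mean that encloses the middle 50 per cent.
PE_FACTOR = unit_normal.inv_cdf(0.75)

def probable_error(standard_error):
    """Probable error corresponding to a given standard error."""
    return PE_FACTOR * standard_error

pe_of_mean = probable_error(1.6)

# Per cent of a normal distribution enclosed by 1 P.E. and by 4 P.E.'s.
within_1_pe = 100 * (unit_normal.cdf(PE_FACTOR) - unit_normal.cdf(-PE_FACTOR))
within_4_pe = 100 * (unit_normal.cdf(4 * PE_FACTOR) - unit_normal.cdf(-4 * PE_FACTOR))

print(round(PE_FACTOR, 4), round(pe_of_mean, 2), round(within_1_pe, 1), round(within_4_pe, 1))
```

The factor rounds to .6745, one P.E. on either side of the mean encloses 50 per cent of the cases, and four P.E.’s enclose a little over 99 per cent, which is the sense in which four P.E.’s practically complete the distribution.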

The reliability of tests. So far we have considered the reliability
of obtained means and differences between obtained
means on tests. The reliability of a test as a whole or of a
part of a test is another problem and is determined in a different
manner.

Reliability data are usually published for valid tests. For
example, Table 37 presents such data from the manual of the
California Tests of Mental Maturity, Advanced Battery.


Table 37. Reliability Coefficients, Advanced Battery, 
California Tests of Mental Maturity 


                                            Grade
Items                        7      8       9      10     11     12     7-12

[label illegible]            .920   .936    .949   .950   .946   .970   .962
Language factors             .923   .879    .913   .934   .933   .944   .945
Non-language factors         .883   .912    .931   .949   .891   .917   .935
Test A. Memory               .886   .930    .920   .959   .885   .914   .925
Test B. Spatial relations    .839   .825    .871   .879   .903   .905   .889
Test C. Reasoning            .874   .864    .902   .906   .907   .934   --
Test D. Vocabulary           --     .89[?]  .886   .897   .915   .927   --
No. of pupils                --     --      100    [?]    100    100    --
S.D. (M.A. in Mo.)           --     --      17.5   18.0   20.0   20.0   25.0

The above reliability coefficients are computed for single
grade ranges and the total range from grades 7-12. However,
the range of the test extends to grade 14. The last column of
coefficients of Table 37 would be increased somewhat if the
total range had been used.

The reliability of a test has previously been described as 
the extent to which it is consistent, or the consistency with 
which it measures whatever it measures. Reliability is de¬ 
termined through correlation; and the magnitude of the re¬ 
liability, by the size of the correlation coefficient, usually 
designated by r, in conjunction with certain other factors to 
be described later. 

Three methods of determining the reliability of a test are 
commonly used. 

1. Retest method: The same test is given a second time (after a
short period has elapsed) and the correlation between the two
results computed.

2. Two-form method: Several forms of the same test are frequently
developed. Two different forms may be given and the correlation
between them computed.

3. Split-halves method: A test may be broken into two parts (odd
and even items) and the correlation between the two sets of
scores computed. This method requires a correction which will
be discussed later.
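The split-halves method can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the response matrix is invented (1 for a correct item, 0 for an incorrect one), the function names are hypothetical, and the correlation is computed from raw scores rather than from a grouped table.

```python
from math import sqrt


def split_half_scores(item_scores):
    """Return (odd-item totals, even-item totals), one pair per pupil.
    Items are numbered from 1, so odd items sit at indexes 0, 2, 4, ..."""
    odd_totals = [sum(row[0::2]) for row in item_scores]
    even_totals = [sum(row[1::2]) for row in item_scores]
    return odd_totals, even_totals


def pearson_r(xs, ys):
    """Product-moment correlation between two equally long score lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sum_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sum_xx = sum((x - mean_x) ** 2 for x in xs)
    sum_yy = sum((y - mean_y) ** 2 for y in ys)
    return sum_xy / sqrt(sum_xx * sum_yy)


# Hypothetical responses of five pupils to a six-item test.
responses = [
    [1, 1, 1, 1, 1, 0],
    [1, 0, 1, 1, 0, 0],
    [1, 1, 0, 1, 1, 1],
    [0, 0, 1, 0, 0, 0],
    [1, 1, 1, 1, 1, 1],
]
odd_half, even_half = split_half_scores(responses)
half_test_r = pearson_r(odd_half, even_half)  # uncorrected split-half coefficient
```

The retest and two-form methods use the same `pearson_r` step; only the source of the two score lists differs.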

Computing a coefficient of correlation. The computation of
a coefficient of correlation is illustrated in Table 38. The data
constitute 600 split-half scores on the California Test of
Mental Maturity, Advanced Battery. Across the top of the
table is one set of scores (18-21, 22-25, etc.) and at the left
another set (26-29, 30-33, etc.). Since the score of each subject
was split into two parts, both of these halves are tabulated
simultaneously. For example, a student makes 47 on one half
and 32 on the other half. The tabulator finds on the left the
interval which would include 47 (the interval 46-49); moving
toward the right, he locates the column under the interval which
would include 32 (interval 30-33 at the top); one entry suffices,
therefore, to tabulate both halves of the score. There
are three entries in this particular cell. Ignore the circled
numbers for the present.

These entries are next counted horizontally and entered in
the f column at the right; and they are counted vertically and
entered in the f line below. These now constitute two
frequency tables.

The f, x, fx, and fx² columns at the right are the data, already
familiar, utilized in computing the S.D.; the same is true
of the f, y, fy, and fy² lines at the bottom of the table, which
are utilized for finding the S.D. of the distribution whose class
intervals are at the top of the table, and whose frequencies
appear as the f line at the bottom of the table.

There remains but the method of completing the fx and fxy
values at the bottom of the table before substituting in the
formula for r (coefficient of correlation by the product-moment
method). The S.D.'s of both distributions were computed
from assumed means located at the midpoints of class intervals;
these intervals have been indicated by double lines.
The cell containing 0 and 29 is at the assumed midpoints of
both distributions. The values in the circles, or fx's, are obtained
by multiplying the frequencies (originally tabulated)
by their deviations from the assumed midpoint of the distribution
at the right, indicated by the double horizontal
lines. Thus the cell containing the frequency on the 26-29
and 18-21 intervals is 8 deviations below this point and is
therefore labeled -8; the cell on the 70-73 and 66-69 intervals
(containing 15 frequencies) is 3 deviations above the point
of reference, and its fx value becomes 45. The vertical columns
are now added algebraically and entered in the horizontal
line of cells to the right of fx. Each fx value is now multiplied
by the corresponding value on the y line, and the products
entered in the fxy line. The fx values take account of the
deviation of all scores above and below the assumed midpoint
of the 58-61 interval at the left; the y values take account of
the deviations of all scores above and below the assumed
midpoint of the 50-53 class interval at the top. Thus the fxy
entries represent the products of the frequencies times
their deviations (plus or minus) from the two assumed means,
thus accounting for the “product-moment” designation of
this method. The Σfx and Σfxy are computed and entered at
the lower right. We are now ready for substitution in the
formula for computing the coefficient of correlation, which is
as follows:


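$$r = \frac{\Sigma fxy - \dfrac{(\Sigma fx)(\Sigma fy)}{N}}{\sqrt{\left[\Sigma fx^2 - \dfrac{(\Sigma fx)^2}{N}\right]\left[\Sigma fy^2 - \dfrac{(\Sigma fy)^2}{N}\right]}}$$

$$= \frac{4919 - \dfrac{379 \times 602}{600}}{\sqrt{\left[5413 - \dfrac{(379)^2}{600}\right]\left[5798 - \dfrac{(602)^2}{600}\right]}} = .876$$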
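The substitution can be verified with a short Python sketch. The function name `grouped_r` and its argument names are illustrative only. The sum of the fxy products is taken here as 4919, which is an assumption: it is the value that reproduces the stated coefficient of .876 with the other sums given.

```python
from math import sqrt


def grouped_r(sum_fxy, sum_fx, sum_fy, sum_fx2, sum_fy2, n):
    """Product-moment r from the sums of a correlation table,
    with deviations measured in class intervals from assumed means."""
    numerator = sum_fxy - (sum_fx * sum_fy) / n
    x_term = sum_fx2 - sum_fx ** 2 / n
    y_term = sum_fy2 - sum_fy ** 2 / n
    return numerator / sqrt(x_term * y_term)


# Sums as substituted in the formula (sum_fxy = 4919 is the assumed value).
r_half = grouped_r(sum_fxy=4919, sum_fx=379, sum_fy=602,
                   sum_fx2=5413, sum_fy2=5798, n=600)
print(round(r_half, 3))
```

Because both deviations are counted in class-interval units, the interval width cancels and need not appear in the computation.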


The above method may be used to compute r regardless of
the method selected to determine reliability. When r is computed
by the split-halves method, the test is, in effect, reduced
to one-half its original length and the r obtained is lower than
the true r. It may be corrected by utilizing the Spearman-Brown
correction formula, which in the case of split-halves
may be simplified as follows:

$$r_x = \frac{2\,r_1}{1 + r_1}$$

where $r_1$ is the obtained r for the reduced test, and $r_x$ is the
coefficient of reliability for the test as a whole. Substituting
in this formula,

$$r_x = \frac{2 \times .876}{1 + .876} = .934$$
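A brief Python sketch of the correction follows. The function name `spearman_brown` and the parameter `times_lengthened` are illustrative; with a lengthening factor of 2 the general formula reduces to the split-halves form used in the substitution.

```python
def spearman_brown(r_obtained: float, times_lengthened: float = 2.0) -> float:
    """Estimated reliability of a test made `times_lengthened` times as long
    as the test that yielded `r_obtained`."""
    n = times_lengthened
    return (n * r_obtained) / (1 + (n - 1) * r_obtained)


r_whole = spearman_brown(0.876)  # half-test r of .876 stepped up to full length
print(round(r_whole, 3))
```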

When the S.D. of the original distribution of scores (before
they were split into halves) was computed it was found to be
14.60. The normal S.D. for this population (previously discovered
from extensive data) is approximately 16.00. This
r of .934, therefore, represents a reliability based on a somewhat
selected group; hence the reliability of the test for a normal
range may be obtained by another formula:

$$\frac{S.D._n}{S.D._w} = \frac{\sqrt{1 - r_w}}{\sqrt{1 - r_n}}$$

where $S.D._n$ is the S.D. of the scores of the narrower range,
$S.D._w$ is the S.D. of the wider range, $r_w$ is the coefficient of
reliability for the wider range, and $r_n$ the coefficient of
reliability for the narrower range. Substituting,

$$\frac{14.60}{16.00} = \frac{\sqrt{1 - r_w}}{\sqrt{1 - .934}}, \qquad r_w = .945$$
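Solving the range formula for either coefficient gives two small functions, sketched here in Python. The function names are illustrative, and the formula is the one relating the two S.D.'s to the two reliability coefficients, rearranged algebraically.

```python
def reliability_wider(r_narrow: float, sd_narrow: float, sd_wide: float) -> float:
    """Reliability expected over the wider range, given the coefficient
    obtained in the narrower range and the S.D.'s of both ranges."""
    return 1 - (sd_narrow / sd_wide) ** 2 * (1 - r_narrow)


def reliability_narrower(r_wide: float, sd_narrow: float, sd_wide: float) -> float:
    """The same relation solved in the other direction."""
    return 1 - (sd_wide / sd_narrow) ** 2 * (1 - r_wide)


# Coefficient of .934 from a group with S.D. 14.60; normal S.D. about 16.00.
r_normal_range = reliability_wider(0.934, sd_narrow=14.60, sd_wide=16.00)
print(round(r_normal_range, 3))
```

The second function covers the case in which the S.D.'s of both ranges and the coefficient for the wider range are known.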

The standard error of r (or the P.E. of r) is usually given with r.
The formula is

$$\sigma_r = \frac{1 - r^2}{\sqrt{N}}$$

The $\sigma_r$ is interpreted like any σ unit: the chances are 2 to 1
that the true r does not deviate from .945 by more than .004;
and the chances are practically certain that the true r lies
between the points represented by .945 ± 3 × .004.
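The figure of .004 follows from the formula when N is taken as the 600 scores of the correlation table; that choice of N is an assumption made for this sketch, as are the function and variable names.

```python
from math import sqrt


def standard_error_of_r(r: float, n: int) -> float:
    """Standard error of a correlation coefficient: (1 - r^2) / sqrt(N)."""
    return (1 - r ** 2) / sqrt(n)


se_r = standard_error_of_r(0.945, 600)
# Limits within which the true r is practically certain to lie (r +/- 3 sigma).
lower_limit = 0.945 - 3 * se_r
upper_limit = 0.945 + 3 * se_r
print(round(se_r, 3))
```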

Interpreting coefficients. When the coefficient of correlation 
is computed between two applications of the same test, two 
forms of the same test, or two halves of the same test, such a 
coefficient is known as the coefficient of reliability. The 
meaning of such a coefficient cannot be interpreted from its 
size alone. The following factors, among others, must be 
considered: 

1. Range of data. The same test given to two or three succeeding
grades will yield a higher coefficient of reliability than for a single
grade or half-grade range. When the reliability of a single grade
range of a test is .80 to .85, a three- or four-grade range may yield
.95.

If, therefore, we have the reliability for a single grade range
(and the S.D. of the scores), and know the S.D. of the scores for
a wider range, we can compute the reliability of the test over the
wider range by the formula already presented and utilized in
correcting the r of .934.

Knowing the S.D.'s of both ranges and the reliability coefficient
for the wider range, it is, of course, possible with the same
formula to determine the reliability of the test in the narrower range.

2. The accuracy of data. The more accurate the measures correlated, 
the higher the correlation tends to be. When independent ratings 
or marks (not vitiated by frozen subjectivity) are correlated, 
they tend to yield lower coefficients. True relationships may, 
therefore, be inferred to be higher than those obtained with in¬ 
accurate measures. 

3. The number of test items. The longer the test the higher its re¬ 
liability. For this reason when the split-halves method is used 
the test is in effect reduced to half its original length and a cor¬ 
rection may be made with the Spearman-Brown formula, already 
presented and utilized in correcting the r of .876 above. 

Coefficients of reliability for whole tests covering a three-grade
range should ordinarily be at least .90; and for single
grade ranges at least .80. However, short individual sections
of highly reliable tests, particularly for single grade ranges, will
often drop below .80.

The Spearman-Brown prophecy formula may be used to
determine how many times a test must be lengthened or how
many times it must be repeated to attain a given reliability.
Assume that the reliability of a test is .85. It is desired to
raise it to .95. The formula is as follows:

$$r_x = \frac{N r_1}{1 + (N - 1)\,r_1}$$

in which $r_x$ is the desired coefficient of reliability, N is the
number of times the test must be repeated or lengthened or
the number of similar forms of the test which must be used,
and $r_1$ is the obtained coefficient of reliability. The computation
is as follows:

$$.95 = \frac{.85N}{1 + .85N - .85} = \frac{.85N}{.15 + .85N}$$

N = 3.35; that is, the test must be repeated, lengthened,
or comparable forms given to make it roughly 3.5 times
its present length.
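Solving the prophecy formula for N gives a direct computation, sketched below in Python. The function name `lengthening_factor` is illustrative; the expression is an algebraic rearrangement of the formula, not a separate method.

```python
def lengthening_factor(r_obtained: float, r_desired: float) -> float:
    """Number of times a test must be lengthened (or repeated) to raise its
    reliability from r_obtained to r_desired, by the Spearman-Brown relation."""
    return (r_desired * (1 - r_obtained)) / (r_obtained * (1 - r_desired))


n_times = lengthening_factor(r_obtained=0.85, r_desired=0.95)
print(round(n_times, 2))
```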

The validity of tests. The validity of a test is the extent to
which it measures what it purports to measure. Reliability
has been defined as the consistency with which a test measures
whatever it measures. In a sense, therefore, validity is the
broader term and includes reliability, because a valid test must
necessarily be reliable; but a reliable (consistent) test may
not necessarily be valid. For example, a prognosis test may
be consistent in predicting achievement in algebra; but students
obtaining high scores on the prognosis test may later do
no better than those receiving relatively lower scores. Again,
apparently reliable rating scales may lack validity as determined
by outside criteria.

The validity of a test may be investigated in a number of
different ways, including the following:

1. Through the selection of test items 

2. Through the use of a criterion 

3. Through lengthening the test 

4. Through the validity of individual test items 

5. Through the index of reliability 

The validity of a test depends primarily upon the selection 
of test items. In the past, validation has consisted of studying 
courses of study and textbooks for items which appeared 
oftenest, and attempting to eliminate controversial, inconse¬ 
quential, non-essential, ambiguous, trick, and otherwise un¬ 
desirable items. This is still a desirable and necessary step in 
validation. Where judges have been used, they have dealt 
too much with materials and subject-matter, rather than with 
ideas and objectives. In the future, more attention must be 
given to other types of items; to be valid, a test must constitute 
a well-balanced sampling of all important objectives, subject- 
matter, or fact and information items, included. A test of the 
basic learning skills must not be restricted to the oftenest 
mentioned items or items selected on the basis of difficulty; a 
new principle enters because of the dependence of skills upon 
preceding skills, and their contribution to succeeding skills.
Careful analysis of crucial elements of the hierarchy, followed 
by experimental validation, is desirable in skill tests. 

The second approach to validity is through the use of a
criterion or standard. A test of ability to sell life insurance,
for example, may be validated by correlating the scores with
actual sales records over a sufficient period of time to obtain
a reliable criterion. Sometimes a test is correlated against
the average of a number of similar tests. Thus Ruch and
Stoddard* present the correlation of each of fourteen group
tests of intelligence with the average of the other thirteen;
Kuhlmann and Anderson correlated their intelligence tests
with seven others. Inferences concerning validity, however,
are somewhat uncertain. If a new instrument correlates
to a high degree with another it is evidence of the fact that
it is measuring the same thing, and one is valid if the other is.
A low correlation often is interpreted as unsatisfactory, but it
may be evidence of the fact that it is measuring something
different and more important. Again, a high correlation
between a test and a criterion indicates validity only if both
are reliable. The Stanford-Binet is generally assumed to
possess high validity, and has often been used as a criterion
in validating intelligence tests.

* Ruch, G. M., and Stoddard, G. D. Tests and Measurements in High School Instruction.
Yonkers-on-Hudson, N.Y., World Book Company, 1927.

The predictive value (validity) of some tests has been 
increased by lengthening them. The validity coefficient 
(r between test and criterion) cannot be increased in the same 
proportion, however, as can the reliability coefficient by 
lengthening the test. Since this method is of interest primarily 
to the test-maker rather than to the beginning student, it will 
not be presented. 

Individual test items of certain types of tests may be
studied to determine the extent to which they tend to be
answered correctly by the students obtaining the highest
scores and the lowest scores. If they do not discriminate
between the better and poorer students they may be eliminated,
and better items substituted.
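One simple way to carry out such an item study is to compare the proportion of correct answers in the highest-scoring and lowest-scoring groups. The Python sketch below is an assumption-laden illustration, not a procedure given in the text: the function name, the `group_size` parameter, and the sample data are all hypothetical.

```python
def discrimination_index(item_correct, total_scores, group_size):
    """Proportion answering the item correctly among the `group_size`
    highest total scores, minus the same proportion among the
    `group_size` lowest total scores."""
    order = sorted(range(len(total_scores)), key=lambda i: total_scores[i])
    low_group = order[:group_size]
    high_group = order[-group_size:]
    p_high = sum(item_correct[i] for i in high_group) / group_size
    p_low = sum(item_correct[i] for i in low_group) / group_size
    return p_high - p_low


# Hypothetical data: 1 = item answered correctly, paired with total test score.
item = [1, 1, 1, 0, 1, 0, 0, 0]
totals = [61, 58, 55, 49, 44, 40, 37, 30]
index = discrimination_index(item, totals, group_size=3)
```

An index near zero (or negative) marks an item that fails to separate the better from the poorer students and is a candidate for replacement.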

We may make certain inferences regarding the validity of a
test by utilizing the index of reliability. This index represents
the correlation between scores actually obtained by a
test and the theoretically true scores; it therefore reveals
something about the extent to which the test is approaching
the true measure of the trait or ability in question.

The formula for the index* of reliability is as follows:

$$r_{1\infty} = \sqrt{r_{11}}$$

where $r_{1\infty}$ indicates the correlation between the obtained and
true scores, and $r_{11}$ is the obtained reliability coefficient of the
test itself. Thus the index of reliability of a test with a reliability
coefficient of .85 may be computed as follows:

$$r_{1\infty} = \sqrt{.85} = .92$$

* See Garrett, H. E. Statistics in Psychology and Education, second edition.
New York, Longmans, Green and Company, 1937.
This index of reliability indicates that the obtained scores 
agree to a high degree with the theoretically correct scores; 
that is, that the test usually measures what it purports to 
measure. 
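The index is simply the square root of the reliability coefficient, as the following minimal Python sketch shows; the function name is illustrative.

```python
from math import sqrt


def index_of_reliability(reliability_coefficient: float) -> float:
    """Correlation between obtained and theoretically true scores."""
    return sqrt(reliability_coefficient)


index_85 = index_of_reliability(0.85)
print(round(index_85, 2))
```

Since the square root of a number between 0 and 1 is larger than the number itself, the index always exceeds the reliability coefficient from which it is computed.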

Additional possibilities of inferences regarding validity of 
certain tests are as follows: 

1. Intelligence tests. The extent to which average I.Q.'s of unselected
groups approach 100.

2. Intelligence tests. The extent to which I.Q.'s and M.A.'s of
unselected groups approximate normal distributions.

3. Most tests. The extent to which median scores of successive
grade or age groups show progressive increases.¹

4. Most tests. The extent to which the differences between successive
norms exceed the standard errors of the differences.
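The fourth check can be sketched in Python under stated assumptions: the standard error of a difference is taken as the square root of the sum of the squared standard errors of the two norms (the usual formula for independent groups), and the norms and standard errors shown are invented for illustration.

```python
from math import sqrt


def se_of_difference(se_a: float, se_b: float) -> float:
    """Standard error of the difference between two independent means."""
    return sqrt(se_a ** 2 + se_b ** 2)


def gains_in_se_units(norms, standard_errors):
    """For each pair of successive norms, the gain divided by the
    standard error of that gain."""
    ratios = []
    for i in range(len(norms) - 1):
        gain = norms[i + 1] - norms[i]
        ratios.append(gain / se_of_difference(standard_errors[i], standard_errors[i + 1]))
    return ratios


# Hypothetical grade norms and their standard errors.
grade_norms = [42.0, 47.5, 52.0, 55.0]
norm_errors = [0.9, 1.0, 1.1, 1.2]
ratios = gains_in_se_units(grade_norms, norm_errors)
```

Ratios well above 1 indicate that successive norms are separated by more than chance differences between samples would explain.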

The validity of most tests is difficult to establish. The 
methods used must be largely indirect because a criterion is 
so often lacking or unreliable. The methods presented are 
sufficient, however, to aid the beginner to grasp the concept 
and its significance. 

QUESTIONS AND EXERCISES 

1. Explain the process of scaling individual test items. 

2. How are total test scores scaled? 

3. Explain the development of product scales. 

4. In what different ways may the reliability of a test be determined? 

¹ Some Personality and Adjustment tests are calibrated on the assumption that scores
showing normal adjustment or balance should remain more or less constant; other tests use
derived scores which retain the same mean for different age or grade groups.

5. Explain what is meant by correlation.

6. In what way may the Spearman-Brown formulas be used in validat¬ 
ing and interpreting tests? 

7. How does range of data influence correlation? 

8. Outline the various approaches presented in determining the validity
of a test; evaluate and add any others you desire.

CHAPTER SIXTEEN


PLANNING THE MEASUREMENT 
PROGRAM 


No other single activity can do more to promote better
teaching-learning activities, supervision, and administration than
a well-planned and functioning measurement program. In
some schools the desirability of measurement is no longer
questioned; in others, teachers are open-minded, but have not
yet found themselves. In still other schools, fears, biases,
inertia, and indifference are prolonging programs of educational
activities based on the opinions of teachers and administrators,
even though the inadequacies of such evaluation
have long been known.

Current measurement programs. As a result, there are 
scores of schools in which no real diagnoses of learning dif¬ 
ficulties are ever made and in which no evidences of the suc¬ 
cess of the educational program are ever sought or provided 
for teachers and the community. In other communities, re¬ 
sistance to evaluation and other factors have forced optional 
programs. Teachers are told, in effect, “You may decide for
yourselves whether you wish to guess and estimate and use
other well-known amateurish and unreliable methods of
evaluation, or to guide learning on the basis of the much more
valid diagnosis and appraisal which can be obtained through
measurement.” In still other schools and school systems,
well-conceived and continuous programs for diagnosis and appraisal,
functioning in close co-operation with teachers, supervisors,
curriculum departments and administrators, are in effect, and
vital diagnostic and appraisal information is continuously
prepared and distributed so that it is used by all departments
in improving teaching and learning.

The record system. To provide the service just described,
it is essential to plan a careful record system. The individual
record for each pupil should be cumulative, should provide
all important information which will be useful to teachers,
supervisors, and administrators, and should follow the pupil
from class to class and school to school, as he progresses. It
should contain not only a record of the various tests taken,
and of marks, if awarded, but also other significant information
concerning his interests, hobbies, problems, difficulties,
and the like. The record system should be so planned that
individual records need not pass through central files or 
offices (except when pupils withdraw) but may be in constant 
possession of teachers. All information concerning individual 
pupils should be reported to the appropriate offices through 
the use of class record sheets. Figure 68 illustrates the first 
page of an individual cumulative record extending over a 
period of six years. The back of this card contains provisions 
for recording detailed data concerning such factors as extra¬ 
curricular activities, clubs and offices, vocational experiences 
and plans, interests, physical and mental health, and per¬ 
sonality ratings and measurements. 

Figure 69 is part of a class record sheet which should be
used for starting vital information on its way. The record
sheet should be planned, and sufficient duplicate copies should
be made to insure its maximum usefulness. In addition,
supplementary class record sheets for recording specific difficulties
in reading, music, social studies, and the like are sometimes
prepared for the use of curriculum departments, supervisors,
and administrators.

Selecting tests. The basic principle in selecting tests is that
of agreement between objectives of the curriculum and measurement
items of the test. It is not so essential that test items
appear in exact order or at the exact grade level of the curriculum;
norms can easily be modified to allow for differences in
grade placement of objectives. What is important, however,
is that a test contain a well-balanced sampling of the most
important objectives of a subject, unit, or grade experience.

There are other important factors in selecting tests which 
influence their usefulness. For example, there are features 
like the following to aid in obtaining maximum usefulness 
from test results: 

A. A graphic profile of the strengths and weaknesses of individual
pupils. By following simple directions which require no knowledge
of statistics or mathematics, the teacher should be able to
obtain a complete picture of the mastered and unmastered tools
of learning, strong and weak factors in ability, and so forth.

B. A diagnostic analysis for guiding instruction. With a minimum
of time and energy on the part of the teacher, a test should be so
organized that the weakness of pupils doing unsatisfactory work
may be checked and utilized without the bookkeeping which is
typical of the old type survey test.

C. A grade-placement or mental age scale in combination with a
graphic profile, so that the grade placement or mental age of
each pupil may be easily and quickly determined directly from
the score without the necessity of any further computations.

D. Provision for easy administration. Many teachers have had
little training in evaluating or testing pupil ability or the outcomes
of instruction. It is important, therefore, to select tests
of such a nature and with such accompanying instructions that
these untrained teachers may obtain the maximum return with
the least expenditure of time and energy. Such a test must be
easy to give, to score, and to interpret.

Still other factors are related to tentative test selection. 
Before the detailed study outlined above is begun, the tests 
available may be evaluated in accordance with the suggestions 
of the test evaluation plan presented in Table 39.

When the information for several available tests is recorded 
as suggested in Table 39, other features, such as graphic pro¬ 
files and provisions for diagnostic analysis, may be considered. 
Test selection should be a joint activity in which all teachers 
participate, but final decisions should be left to those trained 
in measurement.

Table 39. Plan for Evaluating Tests

Evaluator ....................  Date ....................

1. Name of test ....................
2. Author or authors ....................
3. Publishers ....................
4. Date of publication ....................
5. Price of sample copies ....................
6. Price for class use ....................

(Use other side for remarks and refer to items by number)

Items                                 Detailed Data     Exact Source of Information:
                                      on Each Item      Manuals, Magazine Articles,
                                                        Textbooks, etc.

7. Purposes for which test is
   intended
8. Intended for what grades
9. Kind of norms
   a. Age
   b. Grade
   c. Percentile
   d. Other
10. Reliability
   a. Give full information for
      each coefficient cited
11. Validity
   a. Give information as far as
      possible for each coefficient
   b. Give information regarding
      sources of items and other
      methods of validation
12. Forms of test
   a. Number
   b. Relationship
13. Time required to give test
14. Additional information
    What do those who have used
    the test say about it?




The testing schedule. In order that testing may be handled 
most expeditiously and with the most useful results, school 
systems have found it advisable to have a testing schedule. 
Planned in co-operation with teachers and principals, it usually 
consists of some or all of the following details: 

1. Dates for meetings of examiners 

2. List of tests to be given and dates for giving each

3. Dates for completion of scoring of tests

4. Date when class record sheets and other reports are due in various 
offices 

5. Date for completion of entries on individual records 

In most programs, teachers give the tests and are trained in
examiners' meetings. In other cases, one teacher in each
building is appointed as examiner, and attends meetings and
gives all tests. Time programs are different for these two plans.
In some cases each teacher scores the tests of her own pupils;
in other cases the scoring is done for teachers. Teachers are
usually responsible for class record sheets and individual pupil
records.

Preparing to give tests. When new tests are given by trained
examiners, new examiners, or untrained teachers, it is customary
to hold one or more examiners' meetings. Among the
activities of such meetings are the following:

1. Explanation of the nature and functions of the tests, by director
of research or chief examiner

2. Giving parts of tests by chief examiner to illustrate method and
to familiarize other examiners with tests

3. Giving of parts of tests by examiners for practice and criticism

4. Explanation of scoring (even when examiners do not score) and
practice in unusual problems of scoring and scoring keys

5. Explanation of and practice in transferring scores to record sheets
and cumulative record cards

6. Instruction in bundling, marking, and storing or shipping tests
after records have been made

Scoring tests. Good group tests can usually be scored by 
clerical service. Intelligence tests yielding M.A. and I.Q.’s 
only, survey tests, and other tests on which final results are 
primarily significant may be scored in this manner. However, 
in the case of intelligence tests providing a diagnosis of mental 
factors, and tests of skills of learning providing diagnostic 
profiles or analyses of learning difficulties, much is lost if the 
teacher does not retain the tests of pupils having difficulty.
Teachers need not actually score such papers, even though it
is often desirable, but they should make a study of the items 
missed as a basis for remedial work. Some progress has been 
made in the development of a scoring machine. If it is success¬ 
ful, it will relieve scorers of much present labor and make even 
more comprehensive programs of measurement possible. 

When teachers score test papers, they may often save much 
time by working together, one reading the correct answers, 
and others correcting a number of papers simultaneously. In 
other cases, scoring keys covering whole pages make individual 
scoring faster and more accurate. 

The scope of the testing program. The scope of the testing 
program depends upon a number of factors, among them the 
size and nature of the administrative staff, type of administra¬ 
tive organization, available funds, the training of staff mem¬ 
bers, and the extent of previous efforts to provide reliable 
data on which to base teaching, supervision, and administra¬ 
tion. 

When no trained workers are available, the superintendent 
or principal usually assumes responsibility for the program. 
Frequently his experiences lead him to take summer courses 
in measurement to aid him. Where the system is large enough, 
an assistant superintendent who has administrative and super¬ 
visory duties as well carries on the measurement program as a 
more or less minor responsibility. Best results are obtained, 
however, when one officer has the measurement program as his 
major responsibility, and has sufficient authority to make it 
function. 

In some school systems, each department has grown up and 
functions as a little independent kingdom jealously guarding 
every privilege and defying any contaminating influence from 
the measurement or any other department, regardless of learn¬ 
ing or the welfare of children. Frequently, by additional clerical
service and tact, such departments can be tempted to
utilize the results of measurement to improve their work.
What is often needed, however, is a major operation to convert 
a collection of departments into a school system. 

Frequently measurement is looked upon as an interesting 
project — something which is done to demonstrate that schools 
are up-to-date but not an essential activity. The same atti¬ 
tude in engineering would be that of a board of directors or 
superintendent of buildings who might decide that stresses and 
strains and strengths of materials would be computed on buildings
and bridges during September, but could be guessed and
estimated at other times. Measurement should not be the 
tinkering or mere playing with an idea. Except in its begin¬ 
ning stages and when new instruments are being tried out, the 
measurement program should include the whole school or 
system and should be required. It is time to assume that all 
teachers are not only willing but anxious to utilize the best 
procedures available. Accordingly, they should be provided 
with necessary facilities. If the training of staff members did 
not include diagnostic and appraisal testing, or if such training 
was limited, the administration should aid teachers in obtain¬ 
ing such training. 

The required measurement program should continuously ob¬ 
tain and provide, for appropriate members of the personnel, 
information concerning 

1. The mental maturity, intelligence, or general ability of pupils 

2. The success with which pupils have mastered the basic learning 
skills, and points at which major learning efforts should be 
directed 

3. The success with which individuals, classes, and schools are at¬ 
taining the total objectives of the curriculum, and at what points 
problems of objectives, materials, or procedures exist 

4. The specific aptitudes and limitations of pupils as they progress
to where choices of courses and programs are available

Cost is not an important factor in a measurement program.
The cost of attempting to aid each student to attain the objectives
of the curriculum (instructional costs) ranges from $75
to $300 per year. Fifteen to twenty-five cents per student per
year for tests is all that is required to direct the larger expenditure
wisely and to furnish some evidence of its success. In the
past, no such large expenditures of public monies have ever 
been made with so little evidence of outcome; it is becoming 
increasingly difficult to increase school budgets without more 
adequate proof of the needs of children and youth, and the 
success with which such needs are met. The guidance of edu¬ 
cational activities and the appraisal of results should not 
depend “upon the funds available.” We can no longer afford 
this haphazard policy. 
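
The proportion implied by these figures can be worked out directly. The following is a minimal sketch in Python, using only the dollar ranges quoted above; the function name is a hypothetical one chosen for illustration.

```python
# Illustrative arithmetic only; the dollar figures are the ranges quoted in the text.
def cost_share_percent(test_cost, instructional_cost):
    """Return test cost as a percentage of instructional cost.

    Both arguments are dollars per pupil per year.
    """
    return 100.0 * test_cost / instructional_cost


# Extremes of the ranges given above.
lowest_share = cost_share_percent(0.15, 300.0)   # cheapest tests, costliest instruction
highest_share = cost_share_percent(0.25, 75.0)   # costliest tests, cheapest instruction

print(f"Tests take between {lowest_share:.2f}% and {highest_share:.2f}% "
      "of instructional cost")
```

On these figures the outlay for tests stays well under one per cent of the instructional cost, even at the least favorable combination of the two ranges.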

Mental-maturity, intelligence, or ability tests. The nature 
and function of tests at different periods of development, 
changes in pupils, errors of measurement in testing, the improvement 
of the tests themselves, and changing ideas of teachers and 
administrators concerning their uses are among the factors that 
make it advisable to set up a definite testing program. One 
determination of an I.Q. without any analysis or diagnosis of 
its nature and significance will no longer do. Proper guidance 
requires periodic examinations. Some school systems schedule 
intelligence examinations every two or three years. Others 
test at somewhat longer intervals. 

Actual uses made of such tests have been reported by many 
communities in personal letters to the writer. Stenquist of 
Baltimore, for example, reports that, “Intelligence tests help 
us estimate the general academic ability of pupils in every 
school. They are repeated about every two years. Cumulative 
records are kept.” Woods of Los Angeles indicates 
that two major uses are made of intelligence tests: (1) as an 
aid to teachers in estimating the rapidity and amount of progress 
which they should expect of pupils of their classes, and 
(2) as an additional aid in grouping children in some schools. 

Wadleigh of Reading, Massachusetts, writes as follows: 

Specifically I use intelligence tests for: 

1. Entrance to grade I, sectioning into mental age groups 

2. Assisting as one of ten factors in ability grouping (the I.Q. takes 
fourth place) 

3. As an indication of ability in studying a pupil's promotion or 
assignment (one of ten factors) 

4. In advising pupils' guidance in high-school or college subjects, 
when I use the I.Q. as prognosis of future success in academic 
work. 


Moderow of Louisville replies as follows: 

In no case is any of the questions below decided entirely upon the 
results of intelligence tests, but intelligence tests often help in 
deciding them. 

1. Adequacy of achievement of an individual or a group as compared 
with the mental ability of that individual or group 

2. Entrance into grade 1B of bright underaged children 

3. Placement in various types of special classes, as classes for the 
retarded, classes for discipline cases, sight-saving classes, and 
the like 

4. Placement in particular instructional groups: for example, in the 
grouping of the 7B pupils as they enter the junior high school 

Meyers of Kansas City makes the following uses of intelligence 
tests: 

Group Intelligence Tests are given in the fourth, sixth, seventh 
grades, and freshmen and senior classes. These are for classification 
and for comparison with achievement and for use in the study of 
individuals. Individual intelligence tests are given by the physical 
clinic to all kindergarten children; then, to all first-, second-, and 
third-graders who did not receive the kindergarten test. 

The following program of mental-maturity or intelligence 
testing is suggested to meet the needs of those who participate 
in guiding educational activities: 

1. An individual intelligence test, or a group test specifically 
designed for this purpose, to all entering first-graders. 

2. A checking (group or individual) test is desirable at the end of 
the first year after pupils have made an adjustment to school life. 

3. A group test at the end of the third year, particularly if no check¬ 
ing test was given at the end of the first year; from grade one to 
six is too long a period for teachers to be guessing. 

4. A group test at the end of the sixth grade when the pupil is about 
to enter the junior high school, grammar school, or upper grades. 

5. End of the eighth or ninth grade (depending upon school organ¬ 
ization) when student transfers to high school, or senior high 
school. 

6. End of the senior high school when student is ready for college, 
technical school, or is a candidate for employment. 

7. Beginning of the period of higher or technical education for all 
students for whom no reliable data are available. 

In addition, trained teachers should be available to do in¬ 
dividual testing under such circumstances as the following: 

1. When new pupils enroll. 

2. When there are significant discrepancies between the results of 
group tests; students should in such cases be checked with in¬ 
dividual tests. 

3. When pupil achievement deviates too far from expectations. 

4. When pupils are failing. 

5. When there are problem pupils. 

6. At all other times when ability is a probable factor in solving a 
learning difficulty of a pupil. 

Such a program will yield continuous information over a 
period of twelve years to aid in the guidance of instruction at a 
cost of approximately fifty cents for tests for the total period. 
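
The average outlay per scheduled test that this estimate implies can be sketched as follows. The figure per test is derived here, not stated in the text, and it assumes that only steps 1 to 6 of the suggested program fall within the twelve school years; individual tests given under the special circumstances listed above are not counted.

```python
# Scheduled tests from the suggested program that fall within the
# twelve school years (steps 1 to 6). Step 7 comes after graduation
# and is left out by assumption.
SCHEDULED_TESTS = [
    "entering first grade",
    "end of first year (checking test)",
    "end of third year",
    "end of sixth grade",
    "end of eighth or ninth grade",
    "end of senior high school",
]

TOTAL_COST_DOLLARS = 0.50  # "approximately fifty cents", from the text

# Derived figure, not stated in the text: average outlay per scheduled test.
average_cents_per_test = 100 * TOTAL_COST_DOLLARS / len(SCHEDULED_TESTS)

print(f"{len(SCHEDULED_TESTS)} tests, about {average_cents_per_test:.1f} cents each")
```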

Testing fundamental learning skills. The fundamental 
skills are the tools of learning. Pupils may possess ample 
ability to learn, as determined by intelligence tests, but unless 
they have mastered these tools, they have difficulty in learning. 
In the beginning, these skills constitute a major part of the objectives 
of the curriculum. As pupils grow in mastery, these 
skills become less and less the direct objectives of learning and 
more the tools for attaining other types of outcomes. 

The correlation between the mastery of these skills and other 
types of achievement is much higher than between I.Q. and 
achievement, some of the correlations running as high as .85 
on a battery of skill tests. While curriculum content and ob¬ 
jectives change as development proceeds, the skills are basic 
and more or less universal, and are important regardless of the 
psychology or procedures in which teachers believe. 

Tests of basic skills should be given at least once each semes¬ 
ter or quarter, preferably at the beginning, until the ninth 
grade, and whenever significant learning difficulties are ex¬ 
perienced thereafter. Some schools use a twelfth-grade learn¬ 
ing skills survey to predict college success. Many students 
who cannot read are still recommended to college. 

Table 40 illustrates the Philadelphia annual testing program 
for grades one to nine. A similar program is utilized in the 
senior high schools. 

If standardized diagnostic tests of these skills are given, 
normative information in the form of grade placements may be 
obtained from the same scores. 

Such a program, given at the beginning of each semester, will 
cost approximately $1.25 for each pupil for twelve years. 

General achievement tests. General achievement tests 
have as their primary function the determination of whether 
or not students have attained a given group of objectives 
related to a subject, unit, course, or semester. They should 
logically be given, therefore, at the end of a quarter or se¬ 
mester, or at the conclusion of a subject or course. 

Standardized achievement tests should be given in every 
subject and at the end of every unit of work or course. Only 
in this way can definite information concerning achievement 
ever be obtained. Although content differs in different curricula, 
tests can be carefully selected and norms adjusted, if 
desirable, to local conditions. The disadvantages of such 
measurement are much to be preferred to the present situation, 
where teachers guess instead of measure, where teachers 
at succeeding levels complain about the previous preparation 
of students, and where students are passed on regardless of 
learning success. 

The purpose of achievement testing is not to fail students 
but to guarantee a minimum of learning. At present they fail 
to learn many things and to attain desirable outcomes because 
no real evaluation of learning is ever made. Valid tests must 
succeed the invalid and unreliable essay examination, and ma¬ 
terials and procedures must be utilized to attain at least a min¬ 
imum of important outcomes. The cost of this program de¬ 
pends upon the number of tests given. At present, less than 
ten cents per pupil is spent annually for achievement tests. 

Tests of specific aptitude. Tests of specific aptitude should 
be given when the student has a choice of subject or program. 
The number of such tests is small in comparison with the num¬ 
ber of achievement tests, but they should be more widely used. 
No present test of aptitude will yield a perfect prediction. Too 
often teachers have tried them, discovered their limitations, 
and discarded them, forgetting that such predictions, even with 
their limitations, are more reliable than their own judgments. 
The scores of tests of specific aptitude are another type of 
objective evidence which may be considered in making deci¬ 
sions or recommendations. 

Tests of reading readiness in the first grade, tests of scientific 
aptitude, mathematical ability, and language ability are 
among the common tests of this type. 

Tests of personality and character. The development of 
well-balanced, effective personalities depends upon the ade¬ 
quate direction of the whole program of educational activities. 
Evidences of personality and adjustment problems confront 
the teacher daily. To guard against errors due to misinterpretation 
of limited observation, systematic recorded observation 
is essential in many problem cases. A number of helpful 
instruments and devices have been described in connection 
with adjustment diagnosis in the primary, elementary, and secondary 
and higher schools, in Chapters Five, Seven, and 
Nine, respectively. For the most part, available tests have 
been used to study individual problem pupils. Their wider 
use with normal children will contribute to better adjustment 
of all pupils. 

The testing program. In summary, therefore, a program 
adequate to furnish necessary information to guide and ap¬ 
praise learning would resemble that illustrated in Table 41. 
The cost is a negligible factor as already indicated. Such a 
program would insure a great saving in valuable time since 
educational activity would be better guided toward unattained 
objectives instead of merely engaged in and repeated. Furthermore, 
it takes less time to make adequate diagnoses and 
appraisals with valid tests than many teachers now spend 
regularly on traditional examinations. Most tests can be 
given in one class period; comprehensive batteries giving 
results in three or more fields can be given in two periods. 
Evaluation is not an added activity, but an essential part of 
good teaching procedures which justifies itself on the basis of 
results. 

Testing program and size of community. The size of the 
school or school system should have no particular influence 
on the testing program. The same information is needed in 
all communities, and the accidental birthplace of a student 
should not determine the quality of his education. Some 
small communities have well-trained, alert principals and 
superintendents who have excellent measurement programs. 
With relatively less well-trained teachers than larger com¬ 
munities, they are apparently doing work of excellent qual¬ 
ity. On the other hand, several of the larger school systems 
of the nation have been virtually untouched by the educa¬ 
tional progress of the past twenty years, and know relatively 
little about quality of teaching or adequacy of outcomes in 
their school systems. In some small communities, lay school 
boards limit the measurement program because they have not 
yet found out that teachers cannot obtain valid information by 
the opinion and estimate method, and they hesitate to spend 
money on what appears to them to be an unnecessary activity. 
In larger communities, even where money is available, the 
same attitude on the part of a few or even of one school official 
often produces the same result. 

Beginning a measurement program. A program should be¬ 
gin preferably on a modest scale and grow gradually into full 
stature and usefulness. Teachers must learn the nature and 
functions of tests and how to handle them; the measurement 
department must devise techniques for doing its work most efficiently 
and for providing the maximum assistance to all departments. 
The department must not be allowed to drift. 
Two to three years are sufficient, with proper planning, to build 
an effective program. 

The following suggestions are offered in beginning a new 
program: 

1. Give intelligence and reading-readiness tests to entering pupils, 
and intelligence tests to sixth and ninth or tenth grades the first 
semester. Plan to cover the entire system in three or four 
semesters or quarters. 

2. Every class should have at least one test the first semester; the 
most useful for teachers are diagnostic tests of the basic learning 
skills. Of these, reading is most important, but all of them 
should be tested. They can be handled more easily in battery 
form than separately, but some of the separate tests are somewhat 
more comprehensive. 

3. Add one or two achievement, specific aptitude, or other tests 
each quarter or semester until the program set up is complete. 

4. Keep in touch with teachers, supervisors, and administrators, 
and utilize their suggestions as far as possible. 

Making the measurement program function. No measurement 
program can adequately serve its purpose unless it is 
properly organized, properly administered, and is general 
rather than optional. 

The necessity for a record system which preserves vital information 
and gets it where it may be utilized has already been 
considered. Local conditions may sometimes justify beginning 
a program on an optional basis, but proper leadership will not 
long allow it to remain on such a basis. The resistance of 
teachers is often based on fear of how results will be used. 
When they realize that the major function of measurement is 
to make teaching surer, easier, and more pleasant, and that it 
is not primarily a method of judging teaching, their attitudes 
change. The relative impotence of some programs is due more 
to the timidity of leadership than to resistance of teachers, a 
timidity often based on ignorance of the nature, function, and 
possibilities of measurement, a suspicion of the measurement 
department activities which they never quite understand, and 
uncertainty regarding the unpredictability of certain members 
of boards of education. A board of education should set poli¬ 
cies, and not attempt to administrate them. Measurement is 
a technical matter, and no lay board member should attempt to 
dictate materials or procedures in this instance any more than 
he would attempt to dictate to a doctor or an engineer how to 
diagnose or measure. 

The head of the measurement program should be a teacher 
as well as a measurement expert, preferably a teacher who has 
become expert through an understanding and utilization of 
measurement. Occasionally, the head of a measurement de¬ 
partment has an excellent technical education and can direct 
research, but because of lack of knowledge of classroom activi¬ 
ties and problems is unable to obtain, analyze, and prepare 
materials to be of greatest assistance to teachers and various 
supervisory and administrative departments. The head of the 
measurement department should ideally have this activity as 
his major interest, and should have sufficient authority to ad¬ 
ministrate the policies and program agreed upon. Some pro¬ 
grams are only half effective because they are incidental, or 
because there exist interdepartmental jealousies or uncertain¬ 
ties regarding lines of authority. 

Organization of measurement departments. Where the 
principal or superintendent of a smaller community adminis¬ 
trates the measurement program directly, no problem of au¬ 
thority exists. However, where the measurement program has 
become a specialized service, new problems arise. 

Some heads of measurement programs have the rank or 
freedom of assistant superintendents. Others are responsible 
to assistant superintendents or directly to superintendents. 
Still others have committees (assistant superintendents and 
others) between them and the superintendent. Many are of 
uncertain authority. 

Figure 70 presents the organization plan of the Bureau of 
Measurement, Statistics, and Research of the Baltimore public 
schools, headed by Dr. John L. Stenquist. This organization 
has functioned to give Baltimore an excellent testing program. 

This Bureau maintains indirect contact with the pupils of 
the schools through a continuous program of tests. Test 
records of every pupil in the system are obtained and studied 
each term. The Bureau works in close co-operation with all 
supervisory and administrative offices. A constant knowledge 
of pupil outcomes makes it possible to administrate the 
various departments of the school system so that they “converge 
continually on improved instruction.” 

Because of the peculiar functions of the measurement program 
in obtaining, analyzing, and preparing data which serve 
as the basis of teaching, supervision, and administration, it 
is desirable that the head of the measurement program should 
have the freedom to work which is enjoyed by assistant super¬ 
intendents. This may be accomplished in many ways. Often 
personalities function where mere authority would fail. A 
board of superintendents or principals which sets policies and 
allows the head of the measurement department to adminis¬ 
trate them, with the assistant superintendent responsible only 
to the superintendent, or director responsible directly to the 
superintendent, or with similar freedom and authority, appears 
to be most successful. 

Outlook and immediate needs. Better measurement pro¬ 
grams are inevitable, but are developing slowly. New theories 
and practices with vague and uncertain objectives confuse 
teachers and administrators. Ability to devise measurement 
instruments has outrun the ability of theorists to define or ob¬ 
jectify some of their alleged outcomes. Teacher-training in¬ 
stitutions have offered too much speculative philosophy and 
too many dogmatic devices and have been slow to prepare 
teachers for an objective, scientific attack on their problems. 
However, two to four units in measurement are gradually 
coming to be offered or required, and teachers will soon be able 
to co-operate intelligently in the solution of many problems 
formerly handled in informal manner. 

“Measurement is attaining the status of a program rather 
than a stunt,” says Nifenecker of the Bureau of Reference, Research, 
and Statistics of the New York City Public Schools. 
“The weakness of testing surveys of the past was that they 
were isolated projects. They did not form part of a continu¬ 
ous program, and failed to provide for any organized follow¬ 
up.” Speer of Jerome, Arizona, gives primary consideration 
to follow-up and makes test results yield their maximum 
service. 
Woods of Los Angeles indicates that learning skills should be 
checked twice each year, and that more tests of attitudes and 
thinking power should be utilized. Bixler of Atlanta suggests 
the desirability of a liberal use of carefully validated tests of¬ 
fered by test publishing companies, rather than the use of too 
many developed by school systems themselves. Moderow of 
Louisville believes that the greatest weakness of present programs 
is their failure to return analyzed results to supervisors 
and teachers in time to be of maximum value. Stenquist 
of Baltimore would enlarge and strengthen measurement 
programs, make every effort to find mechanical devices to 
reduce labor, including scoring machines and calculating and 
classifying summary sheets produced as far as possible by 
automatic machinery. 

Measurement programs are going forward. The needs of 
children demand them, and self-respecting members of a profession 
will not continue to utilize methods of proved impotence 
which compare unfavorably with those used by tradesmen. 
Measurement is the most powerful single factor yet 
discovered in improving teaching, supervision, and administration; 
we must educate teachers, supervisors, and administrators 
to utilize it. 

QUESTIONS AND EXERCISES 

1. Explain the relation between the measurement program and 
teaching, supervision, and administration. 

2. What are the essential features of a record system? Devise a 
system which you think would function. 

3. Make a list of the important factors in selecting tests. 

4. Assume that you are the director of a measurement program. 
Write a bulletin announcing the fall testing program and preparations 
for handling it. 

5. Write a report on what you think would happen at a successful 
meeting for training teachers to give a diagnostic test of learning 
skills. 

6. Indicate what you consider a complete testing program. Criticize 
the program suggested in the text. 

7. Exactly what would you do in the way of intelligence-testing and 
why? 

8. Why is the testing of fundamental learning skills important? 

9. Indicate what other types of tests you would use and under what 
conditions. 

10. How would you begin a measurement program? 

11. How would you organize a measurement department? 

12. What are the problems, difficulties, and outlook for measurement 
departments? 

CHAPTER SEVENTEEN 


MARKING AND PROMOTION 

The unreliability of traditional methods has resulted in much 
confusion in marking and promotion. Not only parents and 
pupils, but administrative officers as well have sometimes sus¬ 
pected the honesty and good will of teachers. Students have 
appeared inconsistent in the quality of work done in the classes 
of different teachers. Marks have been awarded on a great 
variety of bases, many not closely related to actual growth and 
development. Workers have tried many methods of equating, 
transposing, and otherwise changing marks, building vast 
superstructures on foundations of shifting sands only to pro¬ 
duce marks little if any more reliable or valid than the unreli¬ 
able data with which they started. If marks are to be given 
and report cards issued, they should be based on the best ap¬ 
praisal-data which the measurement program can produce. 
Even then, they leave much to be desired. 

Why we mark. The Second Yearbook of the California Ele¬ 
mentary Principals’ Association includes a survey of approxi¬ 
mately 280 pupils in the fourth, fifth, and sixth grades, which 
reveals pupil attitudes toward report cards. The following 
questions were asked: 

1. Do you like to receive your report card? 

2. Do you talk with your parents about your marks? 

3. Do your parents offer rewards for better marks? 

4. Do your parents punish you for poor marks? 

5. Do your parents encourage you, and help you to do better next 
time? 

6. Do you try harder when you get a 1 or 2? 

7. Do you try harder when you get a 4 or 5? 

8. Do you like to ask your parents to sign your card? 

9. Are your parents always ready and willing to sign your card? 

The pupil replies are summarized in Table 42. 

Table 42. Result of Pupil Questionnaire 

In a current survey, 907 pupils, 212 teachers, 139 parents, 
and 27 principals, located in 35 different school districts and 
ranging from grade one to twelve, were asked to respond to the 
question, “Should marks be awarded and report cards issued?” 
Their replies are presented in Table 43. 


Table 43. Pupil-Teacher-Parent-Principal Replies to 
Report-Card Questionnaire 

All groups in Table 43 voted overwhelmingly in favor of 
marks and report cards. 

The following reasons given for marking and issuing report 
cards were considered most important by all groups: 


1. To indicate how well pupils are learning 

2. To inform pupils and parents of pupil success and progress 

3. To decide if a pupil is to be promoted or graduated 

4. To force children to behave in school 

5. To make pupils study 

6. To decide on awards, honors, and special privileges 

Evidence to be presented later indicates that traditional 
marks do not reveal pupil success, that they are unreliable and 
lack validity, and that they are not an adequate basis for performing 
the services indicated in numbers 2, 3, and 6 above. 
Numbers 4 and 5 are not legitimate uses of marks, even though 
the possibility and fear of failure does lead to greater effort 
on the part of many students. 

Classroom teachers on every level, school administrators, 
and journalists have denounced marks, because they reveal 
nothing definite concerning the success or failure of students. 
They have described marks as bribes or penalties which would 
be unnecessary if teachers really knew how to utilize worthy 
motives. Table 42 revealed that a majority of parents punish 
pupils for poor marks. Yet these marks do not reveal success 
and failure, but depend very often on what teacher does the 
marking. Research reveals that first impressions of students 
on teachers frequently persist. Marks are rarely raised or 
lowered more than one point on a five- or six-point scale during 
a quarter or semester, regardless of actual changes in student 
achievement; yet a majority of pupils, teachers, parents, and 
principals (Table 43) still wish to retain them, and most col¬ 
leges and universities are in agreement. If we persist in giving 
marks and issuing report cards, their limitations as well as the 
problems which they create should be understood. 

What we mark. Marks if awarded should represent actual 
measures of pupil outcomes, whether they be facts and infor¬ 
mation, skills of the basic-learning type, or thinking, apprecia¬ 
tions, understandings, ideals, attitudes, or habits. However, 
teachers persist in consciously attempting to include such fac¬ 
tors as effort, co-operation, initiative, originality, and the like, 
in marking students in spelling, English, and history. Such 
traits and others are highly desirable. Educational activity 
should be directed to their development, and actual outcomes 
should be recognized, but a mark compounded of English, co-operation, 
and effort, if it could be obtained, would be meaningless. 
Such traits are major elements in themselves and 
should receive independent evaluation. Incidentally, if a 
pupil possesses such traits they should result in higher achievement 
in English, but only actual attainment in English should 
be represented by the mark in English, regardless of the factors 
which contributed to that mark. Reducing achievement 
marks because pupils possess personality defects or fail in 
other ways to make desirable adjustments is dishonest and 
should be discontinued. 

Traditional marking. Five studies in the variability of 
marking were presented in Table 3, Chapter One, which revealed 
that teachers regularly assign to the same examination 
paper marks which differ by as much as thirty to fifty 
points. It follows that the passing mark as popularly conceived 
and utilized by teachers has no real existence. It 
is not a stable standard which represents the division point 
between passing and failing educational performance, but 
rises and falls in response to other factors, including the teacher 
who marks the papers, the difficulty of the examination questions, 
the familiarity of pupils with the considerations involved 
in the examination questions, and the amount of time 
allowed. As a result, the passing marks adopted by various 
schools have little if any relationship to actual standards of 
work. 

Figure 71 presents the questions and answers on an essay 
examination in Physiology. Many teachers still utilize this 
type of examination in the classroom. Candidates for teaching 
positions or special honors in college and university, and ap¬ 
plicants for many types of employment are required to take 
them. Table 44 constitutes a summary of the marks awarded 
by thirty-one teachers on the examination reproduced as 
Figure 71, and the distribution of marks on Question 3. Note 
that the spread of marks on the single Question 3 is approximately 
sixteen points, and that the range of marks on the 
total examination is about fifty points. 

Fig. 71. Illustration of Traditional Essay Examination 

It makes little difference whether the total examination or 
just a part of it is marked, whether the same per cent is allowed 
for each question or all questions are weighted, whether the 
markers are experts in this field or have had no preparation in 
it, whether they knew the pupils or teacher of the class or not, 
or possess many other types of information. Wide variations 
will occur when the same paper is marked by different teachers. 
If a teacher re-marks a set of papers after a period of time, 
variability will be evident, wide differences in marks occurring 
if she does not know the pupils, and somewhat smaller differences 
if she does. 

Table 44. Per Cent of Marks Assigned to Question 3 of 
Figure 71, and to Total Examination 

Finally, when teachers were requested to write the correct 
answer to a given question, less than fifty per cent of any group 
sampled were found to be willing to accept any answer written 
by any member of the group. 

The normal-curve-marking concept. We have learned in 
Chapter Fourteen that the normal distribution so far as education 
is concerned is a statement of how ability and achievement 
are expected, theoretically, to be distributed in the school 
population as a whole. 

Results of surveys indicate that perhaps fifty per cent of the 
school systems of the United States use some form of normal-curve 
marking.¹ The particular percentages of marks of each 
kind are usually used as a guide only, but deviations must often 
be defended in terms of particular conditions which exist. 
Table 45 reveals the different per cents of students assigned 
various marks. 

Table 45. Percentages of Students Assigned Each Mark 
Under Various Systems That Use the Normal Curve as a 
Basis for Marking 

¹ Tiegs, E. W., and Crawford, C. C. Statistics for Teachers, p. 117. Boston, Houghton 
Mifflin Company, 1930. 
Table 44 revealed that the same paper may be awarded any 
mark from Fail to A. Use of the normal curve usually controls 
the number of marks of each type, but does not guarantee that 
A’s are really A’s, and not B’s, C’s, D’s, or Fail. The P.E. of a 
mark on a five-point scale has already been indicated as more 
than a whole point; that is, when a mark of B has been awarded 
the chances are more than 50-50 that it should have been a C 
or an A, or some other mark. 
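
The meaning of a probable error of this size can be sketched numerically. The example below assumes that errors of marking are normally distributed and that the P.E. is exactly one step of the scale (the text says only "more than a whole point"); the function name is hypothetical. It relies on the standard relation that one probable error equals about 0.6745 standard deviations.

```python
import math

PE_TO_SIGMA = 0.6745  # one probable error is about 0.6745 standard deviations


def chance_within(half_width, probable_error):
    """Probability that a normally distributed error lies within +/- half_width."""
    sigma = probable_error / PE_TO_SIGMA
    return math.erf(half_width / (sigma * math.sqrt(2.0)))


pe = 1.0  # assumed: exactly one step on the five-point scale

# By definition, half of all errors fall within one probable error.
print(round(chance_within(pe, pe), 2))

# Chance that the error is smaller than half a step, so that the mark
# awarded is still the nearest mark to the true one.
print(round(chance_within(0.5, pe), 2))
```

Under these assumptions the error exceeds a full step half the time, and the mark awarded is the nearest one to the true mark only about one time in four, which agrees with the statement that the chances are more than 50-50 that some other mark should have been given.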

Among the many arguments urged against the use of the nor¬ 
mal-curve hypothesis are the following: 

1. Normal distributions of marks are never obtained except by 
chance; to use this assumption as the basis of marking overworks 
the accidental nature of test results. 

2. The teacher’s whole effort is to make students more alike with 
respect to important outcomes; even though she begins with a 
normal group, she should produce a skewed distribution if her 
efforts are fruitful. 

3. The normal-curve assumption carries the implication that some¬ 
one should be failed, regardless of the ability of the group or 
amount of learning which occurs. 

4. The use of the normal curve actually causes variability in marking
by awarding the same number of marks of each kind to groups
which differ greatly in achievement.

5. The normal curve should not be used in colleges and universities
since students are a highly selected group.

6. The normal curve should not be used in pre-professional or
professional courses; students should not be passed until they
have attained all objectives of their courses.

7. The normal curve should not be used in advanced undergraduate
or graduate classes where a large measure of elimination has taken
place from the original starting group.

8. The normal curve encourages teachers to be vague and deal in
generalities instead of being definite with regard to objectives
and aiding each student to attain them.

Among the arguments in favor of the normal-curve hypothesis
are the following:

1. There is less opportunity to use marks for disciplinary purposes.

2. In the long run, dependence on the normal-curve hypothesis will
result in more correct marks than will awarding marks
indiscriminately.

3. The use of the normal curve results in less injustice than does
marking indiscriminately.

4. The use of the normal curve prevents certain departments from
obtaining an unfair share of honors and awards by giving their
students too high marks.

5. Trained teachers can obtain evidence of the normality of classes
through valid tests and then adjust marks accordingly.

6. Using the curve causes competition for marks and raises
standards of achievement.

7. The normal curve will not cause marked injustice unless used
mechanically. It should be used only as a point of departure.

The normal-curve hypothesis is helpful as a temporary
marking-control device where the objectives of a course, subject, or
unit have not been definitely determined and defined, and
evidences of attainment determined or agreed upon, so that valid
appraisal is possible. When teachers are trained in
measurement, there is relatively less need for depending upon the laws
of chance in marking in the learning skills, for example, than in
the social studies. The specific objectives in the learning skills
are fairly definitely determined and evidences of their
attainment are more easily identified. In fields where objectives are
traditionally vague and change from time to time, and
evidences of alleged outcomes consist largely of assertions,
imaginative arguments, and firm convictions, however sincere, no
real basis of marking in terms of actual outcomes can exist.
The normal-distribution assumption is then as good as any
other assumption about the distribution of student
achievement.

One of the weakest aspects of the use of the normal curve is 
the treatment of failure. Pupils should ordinarily progress 
regularly through the grades of the elementary schools and 
should not be failed unless they cannot proceed with profit. 
The determination of the probability of success in the next 
grade should not depend upon a more or less accidental position 
at the lower end of a curve as the result of unreliable methods 
of appraisal. This is a matter for experimental determination 
with valid measurement instruments. Present evidence indi¬ 
cates that a boy who appears to have attained very few of the 
objectives of fifth-grade arithmetic may, with a new teacher 
and different motivation, make a good record in the sixth 
grade. 

Ability grouping and marking. Temporary ability groups 
may prove helpful in teaching, but the more or less permanent 
section variety is a source of confusion and difficulty in mark¬ 
ing. If marks are awarded on the basis of traditional examina¬ 
tions and other unreliable methods of appraisal, there is no 
actual measurement anyway. However, the use of objective 
tests does not solve the difficulty. 

The nature of the problem will be better understood by
reference to Table 46.

Table 46 presents objective test scores earned by three
classes which prove to be of varying ability so far as this test is
concerned. However, there is much overlapping of scores;
a D in Class 1 may be equal to a C in Class 2, or to a B in Class
3. If the poorest ten, five, or three per cent of each class were
to be failed in accordance with the practices of certain
communities, those failed in Class 1 would receive C’s in Class 2
or 3. Similar discrepancies would appear if all three classes
were thrown together as illustrated in Columns 8 and 9.

complicates their determination. If a complete range of marks
cannot be given for each class, many teachers hesitate to teach
the lowest groups, particularly where no special classes have
been established and dull normals and subnormals are taught
in low sections of regular size.

The way out, which is utilized by many communities, is to 
give a full range of marks in each section, but to designate in 
which section the mark is attained. Thus A2 indicates an A 
awarded in the second section, and B3 indicates a B awarded 
in the third section. However, this practice is not very satis¬ 
factory, particularly when sections are organized on the basis 
of a single test, since a second test would place many students 
in other classes, and corresponding sections of different schools 
or school systems are not equal. A better solution of the mark¬ 
ing problem is needed. 

Traditional marks and test scores. Attention is again called 
to the distinction between testing and marking. Where no 
adequate testing is done, marks are awarded on a large variety 
of bases, including the traditional examination and opinions 
and estimates regarding a large number of other factors. A 
school may desire to discontinue marking, but begin testing, 
because valid tests may reveal learning difficulties of pupils 
and yield valid appraisals of achievement. However, if the 
school desires to mark and issue report cards, such marks may 
be based partially or wholly on the results of such tests. 

As long as schools are willing to promote pupils, high schools
will give students credit, and colleges and universities will
admit students on the basis of traditional marks, this
confusion will continue. Yet marks can be greatly improved and
given a measure of comparability through the use of valid
measurement. With the widespread alleged use of the
normal-distribution hypothesis and enormous amount of time spent
in determining marks, no effort whatever appears to be
devoted to the attempt to determine the normality of the classes
marked. Wherever any adequate testing program has been
established, this information can be obtained from the
cumulative record cards by teachers. A normal group, so far
as mental maturity is concerned, for example, is a class with
a mean I.Q. of 100 and an S.D. of 16. The mean and
standard deviation for normal-group achievement in sixth-grade
arithmetic, ninth-grade science, or college freshman
psychology can be determined in a similar manner by means of tests
using some type of comparable or standard scores.
Achievement can be determined experimentally for groups with
different means (and S.D.’s) of mental maturity, and marks
can be awarded on the basis of ability to achieve. These are
the marks which may appear on report cards to parents. They
do not represent actual or absolute achievement, but are
relatively comparable for pupils of equal ability.
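
A determination of the normality of a class might be sketched as follows. This is a minimal illustration under stated assumptions rather than a prescribed procedure: the I.Q.'s listed are hypothetical values of the kind a teacher might copy from cumulative record cards, and the function names are invented for the example. Only the reference values, a mean of 100 and an S.D. of 16, come from the text.

```python
import statistics

NORMAL_MEAN_IQ = 100.0  # mean of a normal group, from the text
NORMAL_SD_IQ = 16.0     # S.D. of a normal group, from the text


def describe_class(iqs):
    """Return the mean and standard deviation of a class's I.Q.'s."""
    mean = statistics.fmean(iqs)
    sd = statistics.pstdev(iqs)  # S.D. of this class itself, not an estimate
    return mean, sd


def compare_with_normal_group(iqs):
    """Express the class mean and S.D. relative to the normal group."""
    mean, sd = describe_class(iqs)
    return {
        "mean": mean,
        "sd": sd,
        "mean_difference": mean - NORMAL_MEAN_IQ,
        "sd_ratio": sd / NORMAL_SD_IQ,
    }


# Hypothetical I.Q.'s for one class of twelve pupils.
class_iqs = [88, 94, 97, 101, 103, 105, 108, 110, 112, 115, 119, 124]
summary = compare_with_normal_group(class_iqs)
print(summary)
```

A class such as this one, with a mean above 100 and an S.D. well below 16, is both abler and more homogeneous than a normal group, and a fixed normal-curve distribution of marks would therefore misrepresent it.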

However, there is need for scores which represent as far as
possible actual achievement, and for marks which are as
absolute as possible. Both pupil marks and test scores may be
entered on the pupil cumulative record, but the actual test
scores on valid tests serve a function which cannot be served by
traditional marks, or marks based on test scores. This
includes pupil guidance, investigation of the adequacy of
objectives, and of materials and equipment, and evaluation of
teaching, supervision, and administration. Later on,
prospective employers, colleges, technical schools, and universities will
desire this information as well.

Improving the marking situation. No teacher responsibility
has been the subject of more investigation, resolutions,
convictions, and conclusions than marking. Plans suggested vary
greatly. On the one hand are fantastic schemes involving the
averaging and weighting of six to ten factors all based on data
of no value to begin with, which would take hours of time and
a knowledge of multiple regression equations to obtain each
mark. Some constitute mere tinkering with the established
practice, changing from a five- to a three- or two-point system,
substituting S and N for Pass and Fail, or numbers for letters,
and vice versa. Others recommend solving the difficulty by
discontinuing marks entirely, or by dropping marks in
measurable abilities and substituting traits and qualities that require
much greater skill in measurement.

In view of the fact that even for teachers trained in measure¬ 
ment much time is required for obtaining relatively valid 
marks, and that most teachers are not yet trained to do the 
work involved, abolishing the report-card type of mark en¬ 
tirely would appear to be the logical solution. Report cards 
might still be issued periodically, but reports to parents could 
be made in terms of the objectives which pupils have or have 
not attained. The inclusion of such items as co-operation, ef¬ 
fort, citizenship, and the like is justified from the standpoint 
of their importance as desirable outcomes. However, teachers 
are less able to mark these than certain other factors. 

Marks based on test scores. The validity of marks may be
improved through the use of both informal and standardized
tests. With the former, students are ranked more correctly,
and the right pupils are likely to get the right marks.
However, to the extent that groups differ, such marks are not
comparable.

Marks based on standardized tests may be more valid, since 
they may represent more definite amounts of the ability or 
achievement measured. Incidentally, grade placements or 
percentiles established on large unselected groups have been 
utilized for marking purposes. 

When, however, it is desired to transmute scores on tests
into marks, several problems arise. First, some definite
distribution of the per cent of marks of each kind to be awarded
must be agreed upon. There is nothing about the nature or
mathematical properties of the normal distribution which
determines these per cents. Then, too, the validity of test
scores is always reduced by transmutation into marks, and
the same problem regarding failures exists as in the case of
traditional marking.
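
That the per cents follow from an agreement and not from the curve itself can be shown by a short computation. The sketch below is illustrative only; the two sets of cut points (in S.D. units) are assumed for the example and are not taken from Table 45. Each set divides the same normal distribution into five marks, yet the per cents awarded differ widely.

```python
import math


def normal_cdf(z):
    """Cumulative probability of the standard normal distribution."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def mark_percentages(cuts):
    """Per cent of a normal distribution lying between successive cuts.

    cuts: ascending cut points, in S.D. units, separating the five
    marks Fail, D, C, B, and A.
    """
    bounds = [-math.inf] + list(cuts) + [math.inf]
    shares = []
    for low, high in zip(bounds[:-1], bounds[1:]):
        shares.append(100.0 * (normal_cdf(high) - normal_cdf(low)))
    return shares


MARKS = ["Fail", "D", "C", "B", "A"]

# Two assumed conventions for placing the cut points.
narrow_middle = mark_percentages([-1.5, -0.5, 0.5, 1.5])
wide_middle = mark_percentages([-2.0, -1.0, 1.0, 2.0])

for mark, first, second in zip(MARKS, narrow_middle, wide_middle):
    print(f"{mark:<5} {first:5.1f}  {second:5.1f}")
```

With cuts at half an S.D. and one and a half S.D.'s from the mean, somewhat under two fifths of the class receive C; with cuts at one and two S.D.'s, more than two thirds do. Both systems are equally "normal."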

After definite per cents have been established for each mark
the actual scores of classes frequently are so distributed that
application of the rule is difficult or artificial. Often twenty or
thirty per cent of a class will get perfect scores on an informal
test. To a lesser degree, this is a problem with poorly chosen
standardized tests. Table 46 reveals the method used in
awarding marks in four distributions.

The problem of failure requires special attention. Mere
accidental or correct location at the lower end of a distribution
of test scores for a particular test is no proof of failure.
Evidence already presented indicates that this student might have
been a C or B student in some other group. A mark of failure
should not be awarded unless the student cannot go on with
profit. This would require a careful experimental
determination for each test, giving the probabilities of success in the next
grade or subject in terms of the score earned, and establishing
a crucial score below which promotions might not be made.
Such specific aptitude tests have been developed as specialized
instruments for a few secondary and college subjects, but have
not been utilized in the elementary schools. Without
scientifically determined standards the teacher must utilize her
best judgment, based on the objective test scores as well as on
other evidence. When the problem of failure has been settled,
other scores may be redistributed. The following summary
is offered:


1. As far as possible give objective tests of sufficient length, variety,
and adequacy upon which to base all marks in subject fields.

2. Where the group of scores is small, simply rank them in order of
size; but where the group is large, the scores may be tabulated
into a frequency table for ease of handling.

3. Make a tentative award of letter grades on the basis of whatever
system has been adopted for the distribution of grades.

4. If information is available, determine whether or not the group 
is average, above average, or below average. This information 
may be obtained from the results of intelligence tests or the 
norms of achievement from standardized achievement tests. 
Carefully scrutinize the papers of those who on the tentative dis¬ 
tribution have been awarded D’s or Failures, and attempt to 
estimate their chances of success in continuing the work with 
profit. Adjust marks accordingly. 
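
The second and third steps of this summary can be sketched as follows. The sketch is offered as an illustration under assumptions, not as the author's procedure: the pupil names and scores are hypothetical, and the 10-20-40-20-10 system of per cents is only one of many that a school might adopt. The marks it produces are tentative; the scrutiny of low papers called for in the fourth step remains a matter of judgment.

```python
from collections import Counter


def frequency_table(scores):
    """Tabulate how many pupils earned each score, highest score first."""
    counts = Counter(scores.values())
    return sorted(counts.items(), reverse=True)


def tentative_marks(scores, percents):
    """Award tentative marks by rank order of score.

    scores: dict mapping pupil to test score.
    percents: list of (mark, per cent) pairs, highest mark first,
              assumed to total 100.
    """
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    n = len(ranked)
    marks = {}
    start = 0
    cumulative = 0.0
    for mark, pct in percents:
        cumulative += pct
        end = round(n * cumulative / 100.0)  # last rank receiving this mark
        for name, _score in ranked[start:end]:
            marks[name] = mark
        start = end
    return marks


# Hypothetical system of per cents and hypothetical scores.
SYSTEM = [("A", 10), ("B", 20), ("C", 40), ("D", 20), ("Fail", 10)]
scores = {
    "Pupil 1": 48, "Pupil 2": 45, "Pupil 3": 41, "Pupil 4": 39,
    "Pupil 5": 36, "Pupil 6": 34, "Pupil 7": 31, "Pupil 8": 27,
    "Pupil 9": 22, "Pupil 10": 15,
}
marks = tentative_marks(scores, SYSTEM)
print(frequency_table(scores))
print(marks)
```

Because the award is made strictly by rank, pupils with equal scores may fall on opposite sides of a dividing line; such cases, like the tentative D's and Failures, should be adjusted by the teacher.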

The use of different tests with different numbers of items and 
of different difficulties on similar groups causes additional prob¬ 
lems. However, to the extent that the same per cent systems 
are established and followed, the same standardized tests used, 
and the means approximately equal, marks awarded are likely 
to be comparable. 

Success and failure. We have considered the limitations of
the passing-mark concept as a measure of success, and the
difficulties of utilizing the normal-curve hypothesis to
discriminate between success and failure. While experimentally
derived crucial scores for specific tests might be utilized, their
application still requires a large measure of judgment.
Regardless of the marking or report system used, promotion and
failure constitute different, even though closely related,
problems. There is no known absolutely reliable way for
determining probable success, if promoted, even though prognosis
tests may aid. Such success still depends too much upon
factors other than ability and previous achievement. Again,
what is regarded as pupil failure may in reality be teacher,
supervisor, or administrator failure. Almost all children and
youth can succeed in terms of objectives set up, if their
difficulties are diagnosed and their remedial activities are properly
directed. Certain weaknesses in the school program have
forced pupils into a defensive attitude, and resulted in a
number of types of undesirable behavior. It is no wonder,
therefore, that there has been so much demand for trial-promotions
and no-failure programs.

Trial promotions. Evidence indicates that the large number
of pupils formerly failed in the attempt to “uphold high
standards” was a grave mistake. Quality of work is determined by
teaching-learning activities, and not by the number or
percentage of failing students. Studies of trial promotions indicate
that from fifty to eighty-five per cent of the pupils so
promoted succeed. One of the best-known studies is that of
Buckingham,* made in the cities of Springfield and Decatur, Illinois.
On two different occasions after teachers had made out their
failure lists, 1276 the first time and 581 the second, all pupils
were promoted on trial. Both times more than seventy-five
per cent were successful. This is a true indication that there
has been too much failure!

The no-failure program. Marking and promotion
difficulties as well as changes in educational philosophy have led to a
demand for a no-failure program. Under this program, all
pupils are expected to be successful each quarter or semester;
if any are failed the burden of proof is on the teacher. Such a
program is expected to stimulate teachers to study mental
maturity, achievement, social maturity, and all other pertinent
factors, to diagnose pupil needs and difficulties and direct
learning activities so successfully that few if any pupils will
ever fail. Others hold that school is life, that “adjustment is
more important than achievement,” and that all pupils should
be promoted as a matter of routine; they forget, however, that
achievement is a major factor in adjustment. The no-failure
program is not the possession of any particular philosophy of
education or school of psychology. It is, however, more
dependent on adequate diagnosis and valid appraisal than any
other plan.

* Buckingham, B. R. “Trial Promotion [...]”; in Journal of
Educational Research [...].

A survey of 1218 pupils, teachers, parents, and principals
indicates that a majority in each group, even of pupils, do not
favor the no-failure plan. Table 47 gives the ratio of their
answers to the question, “Should there be a no-failure
program?”


Table 47. Pupil-Teacher-Parent-Principal Replies to
No-Failure Questionnaire

Answers    Pupils    Teachers    Parents    Principals
Yes           431          55         48             9
No            451         140         67            17
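
The ratios in Table 47 are more easily compared when reduced to per cents. The sketch below is an illustration only. It takes the first row of figures as favorable replies and the second as unfavorable, an inference from the statement that a majority in each group do not favor the plan; the variable and function names are invented for the example.

```python
# (favor, do not favor) replies by group, as read from Table 47
replies = {
    "Pupils": (431, 451),
    "Teachers": (55, 140),
    "Parents": (48, 67),
    "Principals": (9, 17),
}


def per_cent_in_favor(favor, oppose):
    """Per cent of a group's replies that favor the no-failure program."""
    return 100.0 * favor / (favor + oppose)


total_replies = sum(favor + oppose for favor, oppose in replies.values())
in_favor = {
    group: per_cent_in_favor(favor, oppose)
    for group, (favor, oppose) in replies.items()
}

for group, pct in in_favor.items():
    print(f"{group:<10} {pct:5.1f} per cent in favor")
print("total replies:", total_replies)
```

The replies total 1218, the number surveyed. Pupils are almost evenly divided, while teachers are the least favorable group.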



Reasons given by those who favor the no-failure program
are as follows:

1. Failure causes inferiorities. 

2. Pupils of limited ability can succeed. 

3. Children will be saved the worry and chagrin of failure. 

4. Children can no longer be failed for disciplinary reasons; the 
teachers will use better motivation. 

5. Children will become accustomed to success. 

6. Happy children learn best. 

7. It is an easier program for teachers with low groups and wider 
ranges of ability. 

8. Little is learned in repeating. 

9. Failure merely discourages and does not aid pupils. 

10. Real failures cannot do the work no matter how many times 
they repeat. 

11. Failure results in placing pupils in the wrong age and social 
groups. 

12. There is no reliable way of determining failure. 

13. Marks are too unreliable to fail pupils. 

14. If the school places and teaches a pupil properly he will not fail. 

15. Parents are friendlier and will support the schools. 

16. Repeating wastes tax funds. 

Among the reasons given by those who do not favor the
no-failure program are the following:

1. It lowers standards.

2. Children will not work as well if they know they will pass anyway.

3. Failure is cumulative; if students are allowed to go on in
mathematics when they are not succeeding they become hopeless.

4. There are bound to be some failures; we cannot beat Nature.

5. Teachers in schools trying the program say that pupils are doing
less and less.

6. A pupil cannot succeed in advanced work without a proper
foundation.

7. The program is unfair to pupils, because it turns them out of
school unprepared for life.

8. We must face failure; some deserve to fail.

9. Failures rob the normal children of proper attention.

10. Fear of failure stimulates effort.

11. It is not fair to pupils who study and succeed, to pass the lazy
as well.

12. Slow students should become aware of their limitations and
adjust to them.

13. Marks are an incentive.

14. The no-failure program is contrary to life; it is not a real
preparation for problems of life.

15. Effort and success should be rewarded.

If teachers are unprepared, or misunderstand its nature and
spirit, or utilize no adequate measurements of ability or
achievement in directing learning, the no-failure program may lead to
serious consequences. Experience so far indicates that there
is probably as much danger as virtue in the programs so far
established. The writer is in favor of the no-failure principle,
but it must be based on such expert knowledge of objectives,
materials, and procedures, and on the utilization of such
adequate techniques of diagnosis and appraisal that it will become
increasingly difficult for any pupil not to achieve success.

Summary. Probably too much importance has been
attached in the past to marking and marking problems.
Because of the unreliability of the data used, much of this
discussion and research has had no significant relation to actual
learning; attention frequently has been deflected from
teaching-learning activities to unproductive marking activities. The
following facts, observations, and suggestions are offered in
summary:

1. Marking and the issuing of report cards will probably continue 
for some time to come. 

2. Traditional type marking is unreliable; no such reality as an ab¬ 
solute or stable passing mark exists; established passing marks 
have no demonstrable relation to quality or quantity of learning; 
and the use of some variation of the normal-curve hypothesis 
controls only the number of marks of each kind awarded, but 
does not guarantee their correctness. 

3. If marks are awarded, they should be based on the results of the 
most valid evidence obtainable; but even when based on the most 
valid tests, they are only relatively absolute or comparable, be¬ 
cause of such factors as differences between the meanings of 
scores of different tests, differences between the means and S.D.'s 
of different tests, and variations in the application of the normal- 
curve hypothesis in the distribution of marks. 

4. Unanalyzed or total M.A.’s or I.Q.’s should be used for group 
rather than individual prediction. When broken down into 
factors they are of more significance in guiding individual learn¬ 
ing and for predicting individual success. Tests of specific apti¬ 
tude are of more service in predicting individual success, and, 
together with achievement data, in judging the adequacy of 
individual outcomes. 

5. Achievement test data should be utilized primarily in con¬ 
junction with mental-maturity data to determine the adequacy 
of outcomes of classes, schools, and school systems, and as a point 
of departure for improving such factors as objectives, materials, 
teaching-learning activities, supervision, and administration. 

6. Systematic recorded behavior observation, and character and
personality tests are useful primarily for guiding adjustment;
relatively little is yet known about expressing variations in such
adjustments in terms of marks.

7. Diagnostic tests (and analyzed M.A.’s and I.Q.’s) should be
utilized to anticipate and prevent so-called failure.

8. The concept of failure should be eliminated from the thinking
and appraisal activities of teachers particularly during the
compulsory education period. Children and youth can succeed
when sufficiently specific objectives are properly identified and
materials and procedures are utilized in terms of their needs,
interests, and capacities. Later, in pre-professional and
professional work, a new principle must enter, and perfection or near
perfection must be demanded.

As time goes on, more and more tests will be published with
scores equivalent to those of other tests. Marks based on
such tests will then be more comparable. In time such scores
may be used directly without being transmuted into letter or
number marks. Crucial scores will then be more serviceable
in pupil placement in the elementary school, in promotion and
graduation in the secondary school, and in admission to
college, technical school, and university, and will serve as
evidences of success in these schools. Pupil or student success
must become more and more a joint responsibility of teaching,
supervision, and administration, as well as student nature,
ability, attitude, and effort.


QUESTIONS AND EXERCISES 

1. Make a list of reasons which you think justify the continuance of
marking.

2. What factors do teachers attempt to include in awarding marks?
Evaluate this practice.

3. Explain the discussion on the “passing-mark illusion.” Criticize
or evaluate it.

4. Do you agree with the assertion that no demonstrable relationship
exists between established passing marks and standards of work?
Why? What criticism, if any, would you make of the author's
illustration?

5. What is meant by variability in marking? Evaluate the
presentation of the concept.

6. Make a list of difficulties in utilizing the normal-curve hypothesis
in marking. How does ability grouping complicate it?

7. Evaluate the discussion on improving the marking situation.

8. How does basing marks on valid tests improve them? What
difficulties still remain?

9. Give your own definitions of pupil success and pupil failure.

10. What use would you make of trial promotions?

11. What are the advantages and limitations of the no-failure program?

12. If as a superintendent you had the privilege of developing an ideal
pupil appraisal and promotion program, write the announcement
which you would send to teachers explaining its provisions.

CHAPTER EIGHTEEN


REPORTING AND INTERPRETING 
THE MEASUREMENT PROGRAM 


The report and interpretation of the results of the measure¬ 
ment program can be one of the most effective means of train¬ 
ing teachers in service. It provides a perspective on larger is¬ 
sues as well as specific information on important details, such 
as variations in M.A.’s and I.Q.’s characteristic of grades and 
schools, weaknesses in the learning skills, and standards of 
achievement reached by classes and schools. Such a report 
shifts attention to actual pupil outcomes, and may be a power¬ 
ful instrument in unifying teachers in an intelligent co-opera¬ 
tive attack on the major related educational problems of a 
school system, instead of periodic and fragmentary attacks in 
isolated areas only. 

The comprehensiveness of such a report is limited only by 
the relative completeness of the testing program and the detail 
desired in interpretation. Space will permit examples of some 
of the most important factors only. 

Mental maturity or intelligence. Among the important 
aspects of this part of the measurement program are the num¬ 
ber of group and individual tests given, distributions of M.A.’s 
and I.Q.’s by grades, and the analysis of data in terms of pre¬ 
dictive value, failing pupils, and classification or grade place¬ 
ment of pupils. 

Table 48 presents a report of the mental-maturity tests
given in a small school system during the period of one school
year.

Table 48. Mental-Maturity Tests Given

Grade       Group Tests   Individual Tests
Entering          0             451
1               360               0
2                 0              52
3                 0              13
4               105              40
5                 0              18
6               258              30
7                48               7
8                 0               0
9                 0               5
10                0               0
11                0               0
12              178               0

Table 49 presents part of a distribution of I.Q.’s by grades;
Table 50, part of a distribution of M.A.’s by grades. Both of
these distributions are from reports of an achievement testing
program of private schools. Note in Table 49 that the median
I.Q.’s are 112 or higher. Table 50 reveals that median mental
ages are one to two years higher than normal expectancy for
the school grades in which pupils are working.

From these distributions teachers may predict that
achievement grade placements should be from .7 or .8 of a grade
placement to 1.4 or 1.5 of a grade placement above the established
standard of achievement for children of average maturity. A
comparison of these data with the type of information
presented in Table 52 will reveal, in general, the success with which
pupils are mastering the tools of learning. However, the wide
variations in mental maturity revealed in Tables 49 and 50
will be reflected in achievement.

Differences in mean intelligence between schools are often
significant, and are important in determining the adequacy of
attainment of a number of different types of outcomes. Table
51 presents the mean I.Q.’s for twenty-two elementary schools.
These schools average about 900 pupils each.

Since these schools average 900 pupils, and the S.D. of the
distribution of I.Q.’s for each school averages sixteen, the




Table 49. Kuhlmann-Anderson Intelligence Quotients,
Kindergarten to Grade 4*

                  Kgn.    Grade 1   Grade 2   Grade 3   Grade 4
Total               91       205       215       359       481
Q3              120.17    119.84    121.64    123.61    121.27
Median          112.5     114.78    115.32    115.05    114.18
Q1              102.75    106.50    108.22    105.46    106.85
Range           76-142    80-143    81-153    77-154    76-148
Number Schools       7        19        21        35        42

* From the Educational Records Bulletin, no. 19. Report of
Achievement Testing Program, January, 1937.





Table 50. Kuhlmann-Anderson Mental Ages, Grades 5 to 8*

[Frequency distribution of mental ages by grade, with totals,
quartiles, medians, ranges, and number of schools. The
frequencies and several summary entries are not legible in this
copy; the entries that can be read are given below in years and
months.]

              Grade 5    Grade 6    Grade 7    Grade 8
Q3            12-3.7     13-7.5     14-0.8     17-1.2
Median        11-0.2     12-10.7    13-5.7     15-10.4
Q1            [...]      [...]      12-9.0     14-11.2

* From the Educational Records Bulletin, no. 19. Report of
Achievement Testing Program, January, 1937.

School               Mean I.Q.
 1. Audubon
 2. Jackson
 3. O’Shea
 4. Washington
 5. Dorn               99.5
 6. Nibley             99.5
 7. Detroit
 8. Lannon
 9. 26th St.
10. Tenth Ward
11. Lehigh
12. Davis
13. Central
14. 22nd Ave.
15. Adams
16. Nero
17. Western
18. Valley            106
19. Lincoln           106.5
20. William           109
21. Johnson           109
22. Lane              109

[Means left blank are not legible in this copy; two values of 102
fall among schools 12 to 14.]



standard error of each mean I.Q. is approximately .5 I.Q., and
the standard error of the difference between the means of any
two schools, approximately .7 I.Q. To be significant,
therefore, mean I.Q. differences between schools must be at least
3 times .7 or approximately 2 I.Q. points.
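
The figures just given may be verified by a short computation. The sketch below is illustrative and uses the customary formulas, which the text does not state: the standard error of a mean is the S.D. divided by the square root of the number of cases, and the standard error of the difference between two independent means is the square root of the sum of their squared standard errors. The function names are invented for the example; the values 16, 900, and the multiplier 3 are from the text.

```python
import math


def standard_error_of_mean(sd, n):
    """Standard error of a mean based on n cases."""
    return sd / math.sqrt(n)


def standard_error_of_difference(se_first, se_second):
    """Standard error of the difference between two independent means."""
    return math.sqrt(se_first ** 2 + se_second ** 2)


SD_IQ = 16            # average S.D. of I.Q.'s within a school
PUPILS_PER_SCHOOL = 900
CRITICAL_MULTIPLE = 3  # the text requires a difference of 3 standard errors

se_mean = standard_error_of_mean(SD_IQ, PUPILS_PER_SCHOOL)
se_diff = standard_error_of_difference(se_mean, se_mean)
least_significant_difference = CRITICAL_MULTIPLE * se_diff

print(f"S.E. of each mean:            {se_mean:.2f}")
print(f"S.E. of a difference:         {se_diff:.2f}")
print(f"least significant difference: {least_significant_difference:.2f}")
```

The computed values, about .53, .75, and 2.3, round to the .5, .7, and 2 I.Q. points stated above. With schools of this size, therefore, differences of two or three I.Q. points between school means are already significant.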

In addition, it is enlightening to study the relative agreement
of I.Q.’s and M.A.’s from different tests; individual tests
should be requested where significant discrepancies exist. The
regularity of size of increments of M.A. in succeeding grades is 
a general measure of classification. It is helpful to break down 
M.A.’s and I.Q.’s into their factors, and to note characteristic 
differences in factor-patterns between pupils designated as 
successful and as failures. 

The total percentage of failure for each school can be com¬ 
pared with the average I.Q. or M.A. Figure 72 presents the 
per cent of failures in the New York City day high schools in 
English during 1936. 

* From the New York City Schools, Thirty-Eighth Annual Report, p. 333. 



























FIG. 72. PER CENT OF FAILURES IN ENGLISH BY HIGH SCHOOLS
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When schools with high average mental maturity fail high 
percentages of their pupils, a study of the marking system, 
materials, teaching-learning activities, and other factors is 
indicated. 

As pupils progress through school many drop out. In some 
communities selection results, those with higher I.Q.’s tending 
to remain in school. A distribution of I.Q.’s or M.A.’s of high- 
school seniors compared with the original group is a measure of 
this selection, and is of general value in predicting success in 
higher education. 

Learning skills. Among the important considerations re¬ 
lated to learning skills which reports of the measurement pro¬ 
gram might well include are the following: 

1. Distributions or graphs showing the success with which skills are 
mastered for the system as a whole, for schools, and for grades 

2. Studies of the specific shortages in unmastered skills 

3. Distribution of gains and losses in skill mastery over a period of 
time, or as a result of changes in teaching-learning activities 

4. Factors related to skill mastery, such as success in attaining 
other types of outcomes 

Table 52 presents a distribution of scores of basic learning 
skills for grades 1 to 6, obtained with the use of two batteries
of the Progressive Achievement Tests. These results were 
reported from the testing programs of private schools. Mental 
maturity is considerably above average in these schools, as 
revealed in Tables 49 and 50. Among other things, these data 
reveal that these private schools achieve grade placements 
approximately one year above the public-school norms in these 
grades. 

Another type of report shows the status of a school system 
by grades in the fundamental learning skills, as illustrated in 
Figure 73. Such a report may be made for tests given at any 
time during the year, the lines to the right of the grade desig¬ 
nations representing the grade norms at the particular time 
the tests were given. 



Table 52. Progressive Achievement Test Class Medians, Grades 1-6 *

[Frequency distribution of class medians for grades 1 to 6 on a grade-placement scale running from 9.8 down to 1.0, with the corresponding age equivalents, and with totals, medians, quartiles, ranges, and number of schools for each grade. The cell entries are not legible in this copy.]

* From the Educational Records Bulletin, no. 19, Report of Achievement Testing Program, January, 1937.
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FIG. 73. GRADE MEDIANS IN FUNDAMENTAL LEARNING 

SKILLS 


Figure 73 indicates that skills in arithmetic fundamentals 
and language are relatively weak, some grades being more than 
a whole year below the standard. Such a situation may reveal 
the advisability of an analysis of the specific weaknesses which 
have yielded these results. 

Figure 74 presents, in terms of per cents of failure, a summary
of the learning difficulties experienced by ten per cent or
more pupils in arithmetic fundamentals for the school system 
as a whole, from which we see that thirteen per cent of those 
tabulated failed in carrying in the process of addition, nineteen 
per cent failed in handling zeros, and so forth. Similar studies 
may be made in reading vocabulary and comprehension, lan¬ 
guage, and other factors related to learning or adjustment. 
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When a significant percentage of the pupils of a school or school 
system fails on a given objective or group of objectives, special 
attention to these objectives is indicated. This often involves
such problems as a redetermination of grade placement of 
objectives, a reconsideration of materials, or a re-examination 
of educational procedures. 
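
A tabulation of this kind may be sketched in a few lines. The sketch assumes that each pupil's results are recorded as pass or fail on every objective and that every pupil was tested on the same objectives; the objective names, the sample data, and the ten per cent threshold as a default are illustrative assumptions, not figures from any actual survey.

```python
from collections import Counter

def failure_percentages(responses):
    """responses: list of dicts mapping objective -> True (passed) / False (failed).
    Returns the per cent of pupils failing each objective."""
    failures = Counter()
    for pupil in responses:
        for objective, passed in pupil.items():
            if not passed:
                failures[objective] += 1
    n = len(responses)
    return {obj: 100 * failures[obj] / n for obj in responses[0]}

def needs_attention(percentages, threshold=10.0):
    """Objectives failed by at least `threshold` per cent of pupils, worst first."""
    flagged = [(obj, pct) for obj, pct in percentages.items() if pct >= threshold]
    return sorted(flagged, key=lambda item: item[1], reverse=True)

# Hypothetical results for ten pupils on three arithmetic objectives.
pupils = [{"carrying": i not in (0, 1),
           "zeros": i not in (0, 2, 3),
           "columns": True}
          for i in range(10)]
print(needs_attention(failure_percentages(pupils)))
```

The same tabulation can be run separately for each school or grade by passing only that group's records.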

The adequacy of mastery of the basic skills for classes or 
schools may be judged in the light of intelligence levels of 
classes and schools. Table 53 indicates expected mastery for 
various levels of mental maturity. Plus and minus data indi¬ 
cate fractions of the school year above and below norm. 


Table 53. Relation Between Mental Maturity and Skill Mastery *

Class and School Medians

Percentile  I.Q.  Intelligence  Reading     Reading        Arithmetic  Arithmetic    Language
Rank              Grade         Vocabulary  Comprehension  Reasoning   Fundamentals
                  Placement

99          114   + .94         + .94       + .90          + .93       +1.08         + .93
95          110   + .74         + .72       + .81          + .60       + .85         + .76
90          108   + .61         + .46       + .62          + .42       + .70         + .56
80          105   + .37         + .30       + .38          + .29       + .43         + .35
70          103   + .26         + .20       + .18          + .18       + .27         + .21
60          101   + .13         + .10       + .08          + .09       + .13         + .10
50          100     .00           .00         .00            .00         .00           .00
40          98    - .10         - .09       - .16          - .17       - .27         - .12
30          96    - .18         - .16       - .29          - .34       - .20         - .21
20          94    - .27         - .33       - .40          - .59       - .39         - .34
10          89    - .50         - .53       - .60          - .87       - .57         - .54
5           83    - .60         - .62       - .79          -1.02       - .70         - .67
1           75    - .88         -1.38       -1.83          -2.52       - .79         -1.36
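
One way of using Table 53 is to compare a class's observed standing with the standing expected for its median I.Q. The sketch below assumes linear interpolation between tabled rows, which the manual itself does not prescribe, and uses only a few of the reading-vocabulary entries; the function names are hypothetical.

```python
from bisect import bisect_left

# (median I.Q., expected reading-vocabulary deviation from the grade norm,
# in fractions of a school year), taken from a few rows of Table 53.
EXPECTED_READING_VOCAB = [(89, -0.53), (94, -0.33), (96, -0.16), (98, -0.09),
                          (100, 0.00), (103, 0.20), (108, 0.46), (110, 0.72)]

def expected_deviation(median_iq, table=EXPECTED_READING_VOCAB):
    """Linearly interpolate the expected deviation for a class or school median I.Q."""
    iqs = [iq for iq, _ in table]
    if not iqs[0] <= median_iq <= iqs[-1]:
        raise ValueError("median I.Q. lies outside the tabled range")
    i = bisect_left(iqs, median_iq)
    if iqs[i] == median_iq:
        return table[i][1]
    (x0, y0), (x1, y1) = table[i - 1], table[i]
    return y0 + (y1 - y0) * (median_iq - x0) / (x1 - x0)

def mastery_gap(median_iq, observed_deviation):
    """Observed minus expected deviation; negative values suggest
    that the class is achieving below expectation for its ability."""
    return observed_deviation - expected_deviation(median_iq)

# A class with median I.Q. 105 standing one tenth of a year above norm.
print(round(mastery_gap(105, 0.10), 2))
```

A class standing slightly above the norm may thus still fall short of what its level of mental maturity would lead one to expect.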


As a result of the revelations of a measurement program, 
school systems frequently modify time schedules, materials, 
and teaching-learning activities in the attempt to overcome 
deficiencies.

* From the Manual of the California Tests of Mental Maturity, Advanced Battery.

In addition to revealing grade placements in learning skills
and presenting the specific difficulties, reports of measurement
programs may indicate the extent to which skill mastery is
related to the attainment of other outcomes. Table 54 presents
a number of correlations between the scores of skill tests
and various subject tests, and total content battery scores.

Table 54. Correlations Between Skill Mastery and Other Outcomes

Size of r      f

Above .84      2
.80-.84        2
.75-.79        3
.70-.74        0
.65-.69        1
.60-.64        5
.55-.59        1
.50-.54        0
.45-.49        1

Correlations between skill tests and content batteries are
higher than between skill tests and shorter subject tests.
Since there is considerable variation between tests, actual
relationship for a given district or school should be reported.
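
For a district wishing to report its own relationship, the product-moment correlation may be computed directly from paired scores. The following is a minimal sketch; the pupil scores are hypothetical and serve only to show the form of the calculation.

```python
import math

def pearson_r(x, y):
    """Product-moment correlation between two equally long lists of scores."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two lists of equal length with at least two scores")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when one list has no variation")
    return sxy / math.sqrt(sxx * syy)

# Hypothetical scores of eight pupils on a skill test and a content battery.
skill = [42, 55, 38, 61, 47, 70, 52, 66]
content = [88, 104, 80, 115, 97, 128, 99, 120]
print(round(pearson_r(skill, content), 2))
```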

Locally made tests, both of the skill and other-outcomes
varieties, are frequently used over so long a period of time that
they lose their effectiveness. From too long a use, pupils become
familiar with them. Sometimes, also, stray copies find
their way to students, so that test results are not valid. The
testing program should be modified from time to time to avoid
these as well as other difficulties.

Reporting achievement. As long as subjects continue over
a series of grades, evidence of achievement may be reported
in grade placements. But when subjects are studied for one
semester or one year only, and not always at the same grade
level, a different problem presents itself.

A single subject or type of experience is sometimes reported
by schools, often on the basis of a single test, as illustrated in
Table 55.
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Table 55* Ninth-Grade Algebra Medians by Schools 


School    Median

1         128
2         129
3         130
4         131
5         132
6         137
7         138
8         139
9         141
10        142
11        143


Such a report appears to show wide differences of achieve¬ 
ment in algebra. Yet these schools average but three classes 
of twenty-five or thirty students each, and S.D.’s of the dis¬ 
tribution of scores reach twenty to thirty points. Hence the 
standard errors of difference are approximately three points; to 
be significant, differences between schools must be at least 
eight or nine points. 
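
The reasoning may be checked against the medians of Table 55. The sketch below assumes an S.D. of 20 points and 90 pupils per school, values chosen from within the ranges stated above, and it uses the formula for the standard error of a mean; the standard error of a median would be somewhat larger. None of these particular values is given in the report itself.

```python
import math
from itertools import combinations

# Medians from Table 55, keyed by school number.
medians = {1: 128, 2: 129, 3: 130, 4: 131, 5: 132, 6: 137,
           7: 138, 8: 139, 9: 141, 10: 142, 11: 143}

# Assumed values: S.D. of 20 points, 90 pupils per school.
SD, N = 20.0, 90

se_mean = SD / math.sqrt(N)        # standard error of one school's average
se_diff = math.sqrt(2) * se_mean   # standard error of a difference between two schools
critical = 3 * se_diff             # "three standard errors" rule

pairs = list(combinations(medians, 2))
significant = [(a, b) for a, b in pairs
               if abs(medians[a] - medians[b]) >= critical]

print(f"SE of difference: {se_diff:.1f}; critical difference: {critical:.1f}")
print(f"{len(significant)} of {len(pairs)} pairs of schools differ significantly")
```

Under these assumptions fewer than half of the possible pairs of schools differ by a significant amount, although the list of medians suggests a steady ranking from lowest to highest.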

Sometimes it is desirable to compare relative achievement 
in a number of subjects. Different tests with differing num¬ 
bers of items are often used and, although a norm may be 
available for each, direct comparisons are difficult. In such 
cases it is advisable to transmute scores into more comparable 
units in order better to reveal these relationships. For this 
purpose, scaled scores or T-scores may be utilized. While dif¬ 
ferent methods of scaling are utilized in connection with differ¬ 
ent tests, all scores of subjects to be compared should be 
scaled in the same manner, and the scaling of total scores, uti¬ 
lizing the T-scale method, is recommended. 
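
The effect of such scaling may be illustrated with a short sketch. It assumes the simple linear form of the T-score, with a mean of 50 and a standard deviation of 10 within the group tested; T-scales built from percentile positions under the normal curve would give somewhat different values. The scores and test lengths are hypothetical.

```python
from statistics import mean, pstdev

def t_scores(raw_scores):
    """Convert raw scores to linear T-scores (mean 50, standard deviation 10)."""
    m = mean(raw_scores)
    sd = pstdev(raw_scores)
    if sd == 0:
        raise ValueError("all scores are identical; T-scores are undefined")
    return [50 + 10 * (x - m) / sd for x in raw_scores]

# Hypothetical raw scores for the same five pupils on two tests of different length.
algebra = [12, 18, 24, 30, 36]      # 40-item test
history = [55, 70, 85, 100, 115]    # 150-item test

print([round(t) for t in t_scores(algebra)])
print([round(t) for t in t_scores(history)])
```

Although the raw scores are on quite different scales, the scaled scores of each pupil can be set side by side.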

Figure 75 illustrates comparative achievement in six sub¬ 
jects by means of such scaled scores. 

When such scaled scores are utilized, relationships may be
grasped more quickly and more accurately. Definitely below-standard
work then justifies a study of pupil ability, materials,





FIG. 75. COMPARATIVE ACHIEVEMENT IN DIFFERENT 
SUBJECTS 


procedures, and other factors which might account for this 
result. 

Marks should not be used to report achievement. If, however,
marks are awarded and reported, studies of the variation
of marking between departments and schools may well be
included. For example, Table 56 presents the per cents of
marks of each kind given by the departments of all high
schools of a city combined.

Table 56 reveals wide variations in marking practices be¬ 
tween departments, and a study of valid appraisal techniques 


Table 56. Variations in Marking by Departments 
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15 
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4 
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26 

25 
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15 
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24 

20 

23 
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40 
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with scaled scores is indicated. Schools vary in marking prac¬ 
tices in a similar manner, and many misconceptions regarding 
relative qualities of work in various schools arise because 
teachers, supervisors, and administrators still take marks at 
their face value. 

Table 57 presents the marking practices of the seven high 
schools of a city, in achievement in English. 


Table 57. Variations in English Marks by Schools 


Marks

Per Cents by Schools

1

2 

3 

4 

5 

6 

7 

A. 

9 

10 


14 

23 

48 

5 

24 

41 

31 

4 


10 

B. 

27 

48 

16 

0 

25 

50 

10 

26 

38 

27 

6 

7 

27 

36 

25 

C.. 

D. 

13 

40 

29 

8 






5 


Table 57 shows wide variations between schools in awarding 
marks in English. School 5, which awards most high marks, 
is next to the lowest in average I.Q., and a significant difference 
in intelligence exists between schools 5 and 3, in favor of 
school 3. These data should also interest teachers in expressing 
achievement in a more valid manner. This amateurish method
of indicating achievement and promoting and graduating 
students should be eliminated. 

Reporting other outcomes. Individual tests of intelligence, 
and reading readiness tests, are utilized in the first grades. 
Teachers as well as administrative and supervisory officers are 
often confused regarding the actual values of such tests. It is 
not a question of their being either good or bad, but the extent 
to which these tests aid. For the most part, they predict read¬ 
ing ability more accurately than it can be judged or estimated. 
However, as a rule, predictions are not as accurate on new 
groups as with the group on which such tests are standardized. 

Table 58 presents the theoretical and actual prediction of
reading ability made with the use of the Monroe Reading
Aptitude Tests.*

Table 58. Reading Aptitude Prediction
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Table 59. Mean Scores of Selected Students of Five Colleges 


College   Theoretical   Economic   Aesthetic   Social   Political   Religious

1         33.1          35.4       25.2        28.8     30.9        25.9
2         29.1          36.7       27.1        27.7     34.2        23.8
3         41.2          22.8       36.1        30.9     24.1        23.2
4         28.1          27.8       33.5        34.2     28.1        30.1
5         26.3          24.1       24.3        32.4     27.1        43.2


Teachers face many practical problems in which measurement 
may aid them. Definite, identifiable data, such as absence, 
tardiness, and various types of undesirable behavior, may be 
recorded and tabulated. Comparisons between schools reveal 
extreme variations, as well as provide a basis of judgment re¬ 
garding the seriousness or normality of a given situation. 

When these data are returned to teachers and the most diffi¬ 
cult problems revealed, teachers may be asked to report their 
most successful methods of dealing with them. These may in 
turn be tabulated and reissued as bulletins to all teachers.

Summary. In the past the results of the testing program 
have been obtained and reported, if at all, in too fragmentary 
a manner. Neither teachers, supervisors, nor administrators 
have understood the relationship between different types of 
diagnosis and appraisal or their implications in improving ob¬ 
jectives, materials, procedures, supervision, and administra¬ 
tion. Non-technical reports utilizing the results of the 
measurement program will center attention on pupil outcomes, 
and will result in a more co-operative and intelligent attack on 
school problems. 

QUESTIONS AND EXERCISES 

1. What are the major functions of the measurement program report? 

2. Explain the uses of mental-maturity data presented. What other 
uses would you add? 

3. How would you study failure data on the basis of per cents of 
failure? 
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CHAPTER NINETEEN* 


QUESTIONS AND ANSWERS ABOUT 
TESTS AND MEASUREMENTS


The beginning student in measurement faces several difficulties.
Many of the concepts as well as much of the terminology
are new. He often comes with definite prejudices and
biases for or against measurement, sometimes without foundation,
and often based on a different interpretation of the same
evidence on which the case for measurement rests. The literature
of the field is frequently abstract in nature, certain seemingly
familiar words are used in their technical rather than
their general sense, and occasional disagreements between
authorities combine with other factors to increase the student’s
difficulties.

Origin and organization of questions. For these and other
reasons, the writer has, over a period of ten years, encouraged
his students to commit to writing such difficulties as
they have met with in the preparation of their work, or experienced
in using measurement in their professional activities.
This chapter includes the questions most frequently asked,
together with brief answers. For convenience, they are organized
into five groups, each representing a major aspect of the
nature or use of measurement, as follows:

1. General philosophy of measurement

2. Intelligence, or mental maturity

3. Informal objective tests

4. Standardized tests

5. Specific uses of tests

a. Ability-grouping and classification

b. Curriculum and procedures

c. Teaching success

d. Marking and promotion




1. GENERAL PHILOSOPHY OF MEASUREMENT

1. Question: How can scientific measurement ever be used
successfully in connection with a growing, changing organism
whose principles of growth differ so much from the behavior of
inert matter, the basis on which most scientific method has
been developed?

Answer: The scientist in education does not attempt to
measure the fact of change as such; instead he measures the
status of an individual at different periods with respect to certain
selected factors, and determines the nature, rate, and significance
of changes by comparing the different measurements.
However, physical sciences are concerned not only with inert
matter, but with changes in matter as well. The best answer
to this question is the abundant evidence which shows
that educational measurements are being used successfully in
many ways in guiding the growth and adjustment of children.

2. Question: Are not some phases of learning too subjective
ever to be measured by objective tests?

Answer: “Phases of learning” is a vague generalization
which we use when we do not know just what we mean. As
we clarify the nature of the reality, if any, which corresponds
to our vague feelings, and the evidences which indicate the
presence or absence of this reality, we are increasingly able to
measure it.

3. Question: Does not the value of measurement depend
largely upon one’s philosophy of education?

Answer: If the purpose of education is to effect desirable
changes in individuals, measurement is independent of philosophy.
It is simply a method which increases the accuracy of
information concerning the nature and rate of change. Measurement
is also independent of the psychology of learning
which guides educational activities; it is concerned with what
changes have occurred, and not the manner in which they were

Question: How can we justify the use of the same measuring
instrument on different children when each child
differs in many ways from every other?

Answer: It is by using the same instrument (the same
standard) that we discover the nature and extent of these differences,
and use the information to guide further growth and
adjustment. We use the same yardstick to measure rooms of
different sizes, and the same scale to weigh children of different


children when they differ so from each other?

Answer: The norm is merely a point of reference to indicate
the status of an individual with respect to the measurement
being made. It indicates the status of children in general with
tc 

the position of an individual in comparison with the group as a
whole. There is nothing about the nature of the norm or of
measurement which requires any individual pupil to reach or
exceed it. Thus measurement may reveal that pupils finishing
the sixth grade can spell, on the average, seventy-four words
of a list of 100, but that a particular pupil can spell only

other fac 


feg to learning, m 

ig to spec

7. Question: Is not the traditional essay examination a better
learning device, if not as good a measurement instrument, as is
the objective examination?

Answer: An examination is not, strictly speaking, a learning
device; in general it reveals what learning or adjustment
has occurred. As such, it may be used as the point of departure
or guide to the direction of further learning and adjustment.

Neither is the traditional essay examination a measurement
instrument, because it cannot be marked reliably; different
markers regularly assign marks which differ from thirty to
fifty points on a 100-point scale.

The essay examination has value as a diagnostic device.
It also is widely used by those who believe that education consists
of reproducing the teacher’s lectures or assigned readings.

8. Question: If measurement is of such value in education,
why is it that many educational authorities do not use it, and
some who formerly used it have given it up?

Answer: The application of scientific methods to the direction
of educational activity and the appraisal of educational
outcomes is comparatively new. The same wide differences
of opinion existed when the physical sciences were in their
infancy. In general, those who are not trained in measurement
do not believe in it and cannot use it; some are actively
hostile to its claims. Others have attempted to use specific
instruments without training, often for purposes for which they
were not intended. Still others have condemned measurement
instruments because they produced results at variance with
what they believed to be the facts. They have more faith in
their own opinions, which have repeatedly been demonstrated
to be unreliable, than in an instrument of known reliability.
In many cases, honest differences of opinion exist concerning
the meaning and significance of particular instruments.

However, measurement merely supplies more accurate
for less accurate information. It does not supplant teacher



items, as in the objective test, violate the tenets of modern 
psychology which deals with larger rather than more detailed 
learning situations? 

Answer: We must distinguish between the learning
process and the evaluation of outcomes. In the learning


it should be a sampling of specific items related to specific
elements which are functionally related to each other and to
the larger situation or situations.

10. Question: Now that we have Progressive Education, or
learning based upon a new idea, is it not true that the need for
measurement has passed or is at least much less significant?

Answer: They cannot. Such reports are merely estimates.

2. Question: How high must an I.Q. be before one can be
sure that a student should pass?

Answer: Students are not passed or failed on the rating
of their I.Q.’s, but on their achievement in the light of a number
of other factors, including I.Q. While in general students
with higher I.Q.’s learn and adjust better than pupils with
lower I.Q.’s, the I.Q. of a particular pupil will not predict his
achievement.

3. Question: If the correlation between I.Q. and achievement,
contrary to general opinion, is not high, why give an
intelligence test at all?

Answer: It is much more important to know how to help
a pupil in difficulty than to be able to predict his probable success.
Other types of tests and observation of the pupil often
reveal in what respects he is failing to learn and adjust, but
do not reveal why. An individual intelligence test, or a
group test which breaks down the I.Q. or M.A. (each of which
is a composite), reveals the manner in which pupils may be
aided most effectively.

4. Question: Are not intelligence tests largely composed of
achievement items? If so, why not give only achievement
tests?

Answer: Yes, most items of intelligence tests involve
learning or previous achievement, and if the major purpose of
such tests were to predict achievement, achievement tests
would also yield predictions, some of them much higher than
intelligence tests; but the major purpose of intelligence tests
is to reveal why pupils are in difficulty, and how best to aid
them. Average achievement for a group may be predicted
from average I.Q. with a high degree of accuracy.

5. Question: Does not the lack of constancy of the I.Q.
make it practically useless for any purpose?

has an opportunity to react in terms of his environment, and
the total scores and resulting I.Q.’s are relatively comparable.
On the other hand, such factors as memory, perceptual ability,
and reasoning ability, which may be obtained from individual
tests and some group tests, are relatively independent of
environment and more useful in aiding students.

9. Question: What would you do when a parent insists on
having a student with an I.Q. of 90 take Latin and Mathematics?

Answer: I should advise against it on the general principle
that the I.Q. is relatively low. If the parent persisted,
I should remember that the relation between I.Q. and academic
success may be very low in a given instance. If the pupil
experienced difficulty, I should give an intelligence test which
broke down the I.Q. and revealed why it was high or low and
how best to aid the pupil.

10. Question: Does the statement that the correlation between
I.Q. and achievement is only thirteen per cent better
than chance mean that it is the same in all subjects?

Answer: No. This is the average. In some subjects
or activities, such as reading, it is much higher; in others, such
as writing, it is often negligible. In quoting correlations,
never accept the results of one study. For example, the theoretically
true correlation between I.Q. and reading would be
the average of an infinite number of different measurements.
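
The expression "per cent better than chance" presumably rests on the index of forecasting efficiency, 1 minus the square root of (1 minus r squared); this is an assumption, since the passage does not name the formula. Under that assumption a correlation of about .50 yields the figure of thirteen per cent, as the following sketch shows.

```python
import math

def forecasting_efficiency(r: float) -> float:
    """Per cent by which prediction from a correlation r improves on chance,
    taken here as 100 * (1 - sqrt(1 - r**2))."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("a correlation must lie between -1 and +1")
    return 100 * (1 - math.sqrt(1 - r * r))

for r in (0.30, 0.50, 0.70, 0.90):
    print(f"r = {r:.2f}: {forecasting_efficiency(r):.0f} per cent better than chance")
```

The index rises slowly at first, which is why correlations that appear substantial add comparatively little to the prediction of an individual pupil's achievement.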

11. Question: Does not the boy who has high mechanical
ability but low paper-and-pencil ability have the same claim
to high intelligence as the student who does well on an intelligence
test?

Answer: No. Intelligence is a general concept, and the
I.Q. has come to be accepted as the measure of ability to do the
general academic type of work characteristic of most classrooms.
It is better to designate the former as high specific
aptitude in mechanics. However, one is as honorable as the
other.
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3. INFORMAL OBJECTIVE TESTS 

1. Question: What does the designation informal mean 
when applied to objective tests? 

Answer: The term informal means here that the test has
not been refined or validated like a standardized test. It is
also known as the teacher-made objective test. The items
are usually selected informally, and from a more limited range
of educational experience. The scores on such a test are often
interpreted informally in relation to the average for the group,
but its major purpose is usually to identify the specific elements
of a learning situation which may or have become difficult.
It is also known as a diagnostic test, and may be given
before the work of a course, unit, or topic is begun, when it is
known as a pre-test, or it may be given periodically as other
educational activities are in progress.

2. Question: How may one be sure that each item of an
informal objective or diagnostic test represents an important
functional element of a learning or adjustment situation?

Answer: This is primarily a matter of curriculum making.
The more exactly teachers know what they are trying
to do, that is, the more accurately the important specific objectives
related to more general aims and purposes have been
determined, the greater the functionality of the items. The
items of a diagnostic test are simply a sampling of these important
specific, but functionally related, objectives. However,
the functionality of a specific objective is not a simple matter;
it is conditioned by the ability and previous experience of the
learner, as well as by the nature of the specific element itself.
There are wide differences between individuals in the capacity
to perceive whole situations, as well as to identify the specific
elements which constitute an integrated whole.

3. Question: How can we be sure that the items of a diagnostic
test are of equal significance or are properly weighted?

Answer: We need not be concerned with either of these
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factors in brief informal tests. Because specific functional 
elements of a learning situation are important does not mean 
that they must be of equal significance, if they are unequal, or 
that their relative significance must somehow be evaluated. 
All that is necessary is to attempt to identify those specific ele¬ 
ments each of which have or may become obstacles to learning 
and adjustment. 

4. Question: Is it good policy to construct all informal ob¬ 
jective tests before beginning to teach a course, unit, or topic? 
Will the teacher not then stress what is in the tests? 

Answer: It is an excellent practice to construct informal 
diagnostic tests beforehand. This requires a careful study of
general aims and purposes, as well as of the related functional 
specific elements which are involved in their attainment. 
Such tests can then be used as pre-tests, also. If the teacher 
understands the function of informal diagnostic testing, she 
will not attempt to direct learning and adjustment by mere 
attention to specific elements. If the teacher is of the type 
who drills for the mastery of specific elements in the hope that 
by adding them she will arrive at general aims and purposes, 
she will at least have given attention to the most important 
elements involved. Somehow, the integrated organism of the 
normal child seems able to compensate to a large extent for 
theoretically incorrect procedures. 

5. Question: Is it better to use informal objective tests or 
standardized tests in teaching with units and activities? 

Answer: Both types are needed, the informal type to 
guide learning and adjustment, and the standardized unit-test 
to appraise the total outcomes of each unit. Units and activi¬ 
ties are simply organizations of materials and procedures to 
attain certain desirable outcomes. However, when procedures 
which are new to a particular generation of teachers make 
their appearance, teachers are so occupied with imitating their 
outward forms and discussing their alleged merits that they

9. Question: Is it correct to give informal objective tests in
oral form?

Answer: Short tests of the one-word-answer, true-false, 
or yes-no type may be given to advantage orally. The scores 
may be tabulated by items and the results immediately used 
as a point of departure in teaching and learning. More com¬ 
prehensive tests, using a greater variety of test-forms, should 
be mimeographed or printed. 

4. STANDARDIZED TESTS 

1. Question : What is the difference between a survey and 
a diagnostic standardized test? 

Answer: Survey tests have as their principal function 
the appraisal of the general success of students, usually in 
terms of grade placements. The fields covered as well as the 
grade range are frequently very wide, hence the sampling is 
not very intensive. 

The diagnostic standardized test is usually a more in¬ 
tensive sampling, since its items represent important specific 
functional elements which may become or have become obstacles to 
learning. This usually indicates a more careful selection of 
items. Grade placements and other normative data are de¬ 
rived exactly as in the case of survey or inventory tests. In 
addition, diagnostic tests are usually organized for the con¬ 
venience of the teacher, including profiles, analyses, and other 
aids useful in interpreting and using the results. 

2. Question: How can standardized tests be used with the 
unit-and-activity program, when each student may be engaged 
in a different activity? 

Answer: As long as teachers believe that it does not 
matter what the child does if he is active and happy it will be 
difficult to use standardized tests to evaluate certain outcomes 
of unit teaching. However, if the purpose of all education is 



grade-placement data even though the number of items may 
vary from test to test. The forms of a test are comparable if 
they yield the same grade placements or other normative 
results, even though the raw or original scores are different. 

4. Question: Does not the use of the same standardized 
test year after year control the curriculum? 


Answer: It depends upon the type of educational activi- 

may attempt the direct mastery of each subject-matter out¬ 
come of a particular test. Standardized tests should be 
revised from time to time in harmony with changing objectives 
and changing emphasis on objectives which are retained over a 
period. 

5. Question: If one student obtains twice as high a score 
as another student on a standardized test, has he achieved 
twice as much or is he twice as bright? 

Answer: No. The items of a test are not of exactly 
equal difficulty or merit. Frequently one score point is given 
for each correct response, or items are otherwise scored as if 
they were equal in these respects. For this reason, and 
others, raw or original scores are generally transmuted into 
grade placements or scaled scores to increase their compara¬ 
bility and usefulness. 
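
A small numerical sketch may make the point concrete. It assumes one common kind of scaled score, a standard score with a mean of 50 and a standard deviation of 10; the text does not say which scale any particular test uses, and the raw scores and the helper `scaled_scores` are hypothetical.

```python
from statistics import mean, pstdev

def scaled_scores(raw_scores, new_mean=50.0, new_sd=10.0):
    """Convert raw scores to standard scores with a chosen mean and spread.

    Assumes the raw scores are not all identical, since the standard
    deviation would then be zero.
    """
    m = mean(raw_scores)
    sd = pstdev(raw_scores)
    return [new_mean + new_sd * (x - m) / sd for x in raw_scores]

# Hypothetical raw scores for five students on the same test.
raw = [20, 30, 40, 50, 60]
scaled = scaled_scores(raw)

# The third student has twice the raw score of the first,
# but the ratio of their scaled scores is well under two.
raw_ratio = raw[2] / raw[0]
scaled_ratio = scaled[2] / scaled[0]
print(f"raw ratio: {raw_ratio:.2f}, scaled ratio: {scaled_ratio:.2f}")
```

The ratio between two scores changes with the scale chosen, which is one reason a doubled raw score cannot be read as doubled achievement.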

6. Question: If informal tests are most useful to guide 
activities, why use standardized tests at all? 

Answer: The informal test keeps the educational activity 
properly directed to unattained objectives. In general, 
standardized tests are appraisal instruments given periodi¬ 
cally to show how successful has been this direction of learning 
activity. In addition, standardized tests make it possible to 
compare such factors as general levels of achievement with 
ability, to compare schools and classes, and to study the 
effectiveness of materials, procedures, and equipment. 

7. Question: What assurance have we that a standardized 
test is ever given under exactly the same conditions a second 
time? 

Answer: It is impossible ever to reproduce exactly the 
same conditions which existed previously. All that is neces¬ 
sary is that pupils are ready, have space and equipment with 
which to work, that the room is sufficiently quiet, and that 
the teacher gives the directions faithfully. 

8. Question: How low a reliability may a test have and 
still be considered reliable? 

still differ from each other in many ways. For the most part, 
ability grouping is a mere convenience in classroom manage¬ 
ment, and apparent gains in learning and adjustment are due 
to factors other than grouping. 

2. Question: Is it not true that grouping has a very bad 
effect on personality development of children? 

Answer: Since no significant degree of homogeneity is 
ever attained, except at the extremes, it is not the fact of 
grouping but the attitude of teachers which is most harmful. 
As long as teachers feel that a pupil with an I.Q. below 90 or 
80 is hopeless, pupils will detect the fact. In reality, the cor¬ 
relation between I.Q. and success in any given case is very 
uncertain. More time should be spent in analyzing the I.Q.'s 
and M.A.'s of children to discover why they experience diffi¬ 
culties. Almost all children can learn and adjust well if their 
teachers know how to aid them. 

3. Question: Does not the slow child learn more by being 
exposed to the activities of superior children than by being 
kept in a slow group? 

Answer: Research has discovered no statistically reliable 
differences between the achievement of grouped and un- 
grouped classes. While mere grouping as such appears to 
have no significant effect on learning, it is still true that groups 
of high average ability will achieve more than groups of low 
average ability. 

4. Question: Should separate norms be established for each 
group when ability grouping is used? 

Answer: So many factors are used in ability grouping, so 
much overlapping exists, and so many changes occur that in 
general it would be practically impossible. However, where 
groups are divided on the basis of some standard score such as 
an M.A. or I.Q., average achievement for groups can be 
predicted from their average ability. Hence norms for pupils 
below 75 I.Q., or for groups averaging 90, 100, and so forth, 
are relatively stable. 

philosophies and uses activities which vary from the most tra¬ 
ditional to the most modern without any particular awareness. 
Studies of the relative merits of different methods rarely reveal 
any significant differences because of overlapping and conflict¬ 
ing activities. For this reason, teachers should give more 
attention to individual pupil-learning and adjustment as 
revealed by diagnostic tests and observation. Children learn 
and adjust under a great variety of situations, provided teach¬ 
ers aid them in identifying and solving their difficulties. 

4. Question: Is the unit-and-activity method better than 
most others? 

Answer: Strictly speaking, there is no one such method, 
but dozens of such methods. Units have always been used, 
many of them well developed. Teachers have always used 
reading, special reports, topics, collections, map drawing, 
excursions, and the like in their classwork; these are the activ¬ 
ities of progressive education. The modern movement has 
contributed a new emphasis and further light on the desirabil¬ 
ity of using a natural and effective procedure for attaining cer¬ 
tain outcomes, namely, beginning with wholes, or larger learn¬ 
ing situations, and differentiating their character elements, 
instead of beginning with elements and attempting to synthe¬ 
size them into wholes. New light concerning the unity of the 
functioning organism, with implications for method, has also 
been produced. However, many of these contributions have 
much greater academic than practical significance. Children 
learn and adjust in spite of theoretically wrong methods, and 
experience difficulties when procedures are theoretically correct. 

c. Teaching Success 

1. Question: What is teaching success? 

Answer: Theoretically, teaching success is success in 
aiding students to learn and adjust. Practically, it is receiving 
the approval of the judges, often for reasons not closely related 



to any actual contribution. It is im 

unfavorable marks of school officials prove they are poor teach¬ 
ers. Because of the difficulties of measurement involved, and 
the variation in opinions of what constitutes good teaching, 
such judgments suffer from the same unreliability as marks on 
traditional examinations. Most raters largely ignore actual 
pupil outcomes, perhaps the most important evidence of all. 

5. Question: Of what use are intelligence tests in determin¬ 
ing teaching success? 

Answer: They are of almost no use. There is no high 
correlation between I.Q. or M.A. and the criteria now used as 
evidences of success. This may be due, among other factors, 
largely to the criteria, or it may be due to the fact that most 
teachers are much more intelligent than teaching requires. 
The unreliability of the criterion alone is sufficient to produce 
low correlations. However, intelligence data, if related, could 
reveal only ability. Success would depend on the actual con¬ 
tribution of the teacher. 

6. Question: What is the value of having pupils rate teach¬ 
ers, or having teachers rate themselves? 

Answer: This rating procedure is interesting and divert¬ 
ing, but of no particular significance. It is hardly possible that 
untrained pupils, especially, can do what trained experts find so 
difficult. Officials find out what pupils think of teachers, and 
what teachers think of themselves, but this is no valid measure 
of the actual contribution of teachers to pupil-learning and 
adjustment. It is quite possible that some ineffective teachers 
may have engaging personalities, and some otherwise good 
teachers, certain personality defects. Pupils should learn to 
adjust to a variety of personalities, just as they must outside 
the classroom. 


d. Marking and Promotion 

1. Question: How do marking and scoring differ? 

Answer: Marking is judging the adequacy of pupil 
response to a question, problem, or other task in an informal 
manner; scoring is determining the number of correct re¬ 
sponses to an objective test and adding them, weighted or un- 

The smaller the class, other things being equal, the better the 
opportunity for maximum individual attention and learning. 

If marks are awarded, they should be based on actual 
learning and adjustment. However, to the extent that teach¬ 
ing is vague and is not planned in terms of individual needs 
and capacities, outcomes are likely to follow a normal distribu¬ 
tion. The degree to which a group constitutes such a normal 
distribution will determine the extent to which marking based 
on this concept will reflect actual achievement, provided marks 
are based on objective data. However, there is nothing about 
the nature of the normal curve which requires that anyone 
should fail. 

8. Question: How are the standards of a school affected 
when some teachers use objective tests and some essay exam¬ 
inations? 

Answer: Both types have diagnostic value, but objective 
tests provide better guidance for teaching and learning in many 
fields. Those using objective tests will also have better evi¬ 
dence of the quality of work done. 

9. Question: How should one mark a pupil in history, or 
any other subject, who shows a deliberate lack of effort? 

Answer: If at all, on achievement in the subject con¬ 
cerned. 

10. Question: Should not a mark in arithmetic or reading be 
lowered if the citizenship grade in that class is exhausted by 
demerits? 

Answer: If awarded, a mark in a subject or a skill should 
as far as possible represent actual achievement. To lower an 
achievement mark because of a personality difficulty is not 
only unfair, but reveals a serious lack of understanding both 
of marking and of child guidance. Marks become “ex¬ 
hausted” because of the crude and arbitrary demerit systems 
in use. More attention should be given to the causes and 
cures of personality difficulties. 